Method, data processing device and computer program product for processing data

ABSTRACT

The invention relates to a method for data processing, to be run on a data processing device, for the mapping of input data to output data, where data objects to be processed are entered as input data and the entered data objects are processed by using a topology-preserving mapping, namely by ordering of neurons in the ordering space according to a given pattern, assigning of codebook objects in the outcome space to the neurons, processing of the codebook objects according to the calculation rule of a topology-preserving mapping by use of data objects of the exploration space, and output of the processed codebook objects as output data. The characteristics of this method are that at least a part of the entered data objects is used to determine the order of neurons in the ordering space, and/or that data objects, required for the data processing and independent of the input data to be processed, are entered and used as data objects of the exploration space. The invention further relates to a method for data processing, to be run on a data processing device, for the mapping of data objects to be processed to distance objects, where data objects to be processed are entered, distances between the data objects to be processed are calculated as distance objects, and these distance objects are delivered as output data. The characteristics of this method are that the distances are calculated by use of statistical learning methods, local models, methods of inferential statistics, and/or one of the following specific computation methods: Levenshtein measure, mutual information, Kullback-Leibler divergence, coherence measures employed in signal processing, specifically for biosignals, LPC cepstral distance, calculation methods that relate the power spectra of two signals, such as the Itakura-Saito distance, the Mahalanobis distance, and/or calculation methods relating to the phase synchronization of oscillators. The invention further relates to a method for data processing, to be run on a data processing device, for the determination of cluster validity, where data objects are entered, distance objects between these data objects are entered and/or calculated, and an assignment of the data objects to be processed to groups is entered and/or calculated, specifically according to a method as set forth in one of the claims 1 to 5, and a measure of the quality of this assignment is delivered as output data, characterized in that the measure of the quality of the assignment is calculated employing at least a part of the entered and/or calculated distance objects. Finally, the invention relates to corresponding data processing devices and computer program products as well.

1 BACKGROUND OF THE INVENTION

The presented invention refers to a method for data processing, according to the general specification in claim 1, for the mapping of raw input data onto output data, in particular for the learning of topology-preserving mappings by self-organization, with numerous applications to data processing and analysis. It further refers to methods for data processing according to the general specifications of claims 6 and 7. Finally, it refers to data processing devices and computer program products related to these methods.

Although the concepts used here are independent of any specific model conception, it is useful for the understanding of the present invention to guide their description by basic definitions from the field of neural informatics. In this way, clear interpretations regarding the dynamics of learning in neural networks can frequently be established.

For an introduction to neural informatics, the reader is referred to the relevant standard literature, e.g. [20], [36].

For the technical understanding of topology-preserving mappings, it is useful to build on definitions of data partitioning by vector quantization. In this context, the description follows, among others, [45], [46].

1.1 Vector Quantization

If a data set $X = \{x\}$, where $x \in \mathbb{R}^n$, is to be characterized by a set $C$ of so-called codebook vectors $w_j$, $C = \{w_j \in \mathbb{R}^n \mid j \in \{1, \ldots, N\}\}$, this problem is called vector quantization (VQ). Hereby, the codebook $C$ should represent the statistical structure of the data set $X$, with a probability density $f: \mathbb{R}^n \to [0,1]$, $x \mapsto f(x)$, in a suitable way, whereby "suitable" can be defined in different ways regarding specific objectives. Typically, the number $N$ of codebook vectors will be substantially smaller than the number $\#X$ of data points. For the numerous application fields of VQ, such as analysis and compression of large amounts of data, please refer e.g. to [17].

VQ methods are also often referred to as clustering processes. Both terms will be used synonymously in the following.

In VQ, one distinguishes between so-called hard clustering, where each data point $x$ is assigned to exactly one codebook vector $w_j$, and so-called fuzzy clustering, where a data point $x$ can be mapped, in a suitable way, to several codebook vectors $w_j$.

FIG. 1 shows schematically a neural net as a model of a vector quantizer. It is composed of two layers: an input layer and an output layer. Based on $n$ input cells with the activities $x_i$, $i \in \{1, \ldots, n\}$, the activity pattern in the input layer represents a data point $x$ in the so-called feature space $\mathbb{R}^n$. Through directional connections that are weighted with the weights $w_{ji}$, this activity is passed on to the $N$ cells of the output layer. These cells of the output layer correspond to the codebook neurons. The connection weights, i.e. in the neural context the strengths of the synapses, $w_j \in \mathbb{R}^n$, $j \in \{1, \ldots, N\}$, are hereby chosen so that the activity $a_j$ of a neuron $j$ on the output layer depends, in a suitable way, on the distance $d = \lVert x - w_j \rVert$ of the data point $x$ from the virtual position $w_j$ of the codebook neuron $j$. Here, $d$ denotes any distance measure in the feature space. The term "virtual position" is based on the idea that the activity $a_j$ of the codebook neuron should reach its maximum value for $x_{\max} = w_j$, which can be interpreted as a "specialization" of the neuron $j$ to the position $x_{\max}$.

After the training of the vector quantizer has been completed, an input signal $x$ can be represented by the activations $a_j(x)$ of the codebook neurons $j$, whereby the connection weights of the codebook neuron $j$ to the input layer can be combined to form the codebook vector $w_j$.

Some VQ algorithms can be generally characterized as iterative, sequential learning processes. Hereby, initially, the number $N$ of codebook vectors $w_j$ is determined, and these are initialized. Then, typically, a data point $x \in X$ is randomly chosen and the codebook vectors are repeatedly updated according to the general, sequential VQ learning rule

$$w_j(t+1) = w_j(t) + \varepsilon(t)\,\psi(t, x, C)\,(x(t) - w_j(t)). \qquad (1)$$

Here $t$ denotes the update step, $\varepsilon$ a freely chosen learning parameter, and $\psi$ the so-called cooperativity function. Typically, the learning parameter $\varepsilon$ is chosen monotonically decreasing over consecutive update steps. Due to analogies to systems of statistical physics, this is often called "cooling". Frequently, an exponential cooling strategy is used:

$$\varepsilon(t) = \varepsilon(0) \left( \frac{\varepsilon(t_{\max})}{\varepsilon(0)} \right)^{t/t_{\max}}, \quad t \in [0, t_{\max}]. \qquad (2)$$

Besides the specifically chosen heuristics for the time dependence of $\varepsilon$ and $\psi$, numerous VQ methods essentially differ in the definition of the cooperativity function $\psi$. A simple method for hard clustering is, e.g., given by the LBG algorithm of Y. Linde, A. Buzo and R. Gray [25]. Here, $\psi$ selects, in each learning step, one and only one codebook vector $w_j$ to be updated, according to

$$\psi(t, x, C) := \delta_{i(x), j}, \qquad (3)$$

whereby $i(x)$ is defined by the minimum distance

$$\lVert x - w_{i(x)} \rVert = \min_j \lVert x - w_j \rVert,$$

and $\delta_{i(x), j}$ denotes Kronecker's delta. Because one and only one codebook vector participates in each learning step, this is also called a winner-takes-all learning rule. If, otherwise, $\psi$ is chosen in a way that, in each learning step, several codebook vectors take part in the update, then equation (1) defines a winner-takes-most learning rule. Depending on the definition of $\psi$, different methods for so-called fuzzy clustering result from this.

1.2 Self-Organizing Maps

A classical method of neural network computation is the Self-Organizing Map algorithm (SOM) described by T. Kohonen, e.g. in [24]. Seen in relation to the notes above, this algorithm can be interpreted as a VQ method as well.

Hereby, the choice of the reference space of the metric on which the cooperativity function $\psi$ in equation (1) is based is of essential importance. In the self-organizing map algorithm, as well as in other topology-preserving mappings, the metric of the cooperativity function $\psi$ refers to a target space that is independent of the source space.

The terms source space and target space are to be seen in relation to the mapping

$$\mathbb{R}^n \to \mathbb{R}^N, \quad x \mapsto a_i(x) \qquad (4)$$

of the data points to the activations of the codebook neurons, with the specifications of FIG. 1: The source space is generally identical to the feature space as defined above, e.g. to $\mathbb{R}^n$. In self-organizing maps, the target space can be interpreted, for instance, as a space of the physical positions $r_j$ of the codebook neurons $j$, according to a mapping

$$r: \{1, \ldots, N\} \to \mathbb{R}^k, \quad j \mapsto r(j). \qquad (5)$$

For the scientific discovery of the self-organizing map algorithm, the interpretation in connection with neurophysiological model concepts was essential. For this reason, the target space, i.e. the space of the $r_j := r(j)$, is often referred to as the model cortex. A typical case is, for instance, the ordering of $N$ codebook neurons on a two-dimensional discrete periodic grid (i.e. $k = 2$), in the form of a sensorial map which should represent the input from $n$ sensory cells. For this, there are numerous biological examples, e.g. the retinotopic projection of fishes and amphibians [12]. Here, Kohonen found a heuristic "where the neurons j of the model cortex coordinate their sensitivity to input signals x in a way that their response behavior to signal characteristics varies, in a regular way, along with their position on the model cortex" (freely quoted according to [36]). For the neurophysiological motivation, as well as for the mathematical definition, please refer to [36].

Here, the physical position $r$ of the codebook neurons determines the metric of the cooperativity function $\psi$. Its concrete choice, e.g. as a Gaussian function

$$\psi(r, r'(x(t)), \sigma(t)) := \exp\left( - \frac{(r - r'(x(t)))^2}{2\sigma(t)^2} \right) \qquad (6)$$

or, e.g., as a characteristic function on a $k$-dimensional hypersphere around $r'(x(t))$,

$$\psi(r, r'(x(t)), \sigma(t)) := \chi_{\lVert r - r'(x(t)) \rVert \le \sigma(t)} := \begin{cases} 1 : & \lVert r - r'(x(t)) \rVert \le \sigma(t) \\ 0 : & \lVert r - r'(x(t)) \rVert > \sigma(t), \end{cases} \qquad (7)$$

is, in contrast, of minor importance. In this context, according to

$$\lVert x - w_{r'} \rVert = \min_r \lVert x - w_r \rVert, \qquad (8)$$

$r'(x(t))$ defines, for a given stimulus $x(t) \in \mathbb{R}^n$, the neuron with the highest activity, the so-called "winner neuron". For characterizing a codebook neuron, its physical position according to (5) is used directly. Thus, the learning rule (1) becomes

$$w_r(t+1) = w_r(t) + \varepsilon(t)\,\psi(r, r'(x(t)), \sigma(t))\,(x(t) - w_r(t)). \qquad (9)$$

Here, $\sigma(t)$ denotes the corresponding cooperativity parameter from equations (6) and (7), respectively. It is a measure of the "stretch" of the neighborhood function $\psi$ in the model cortex and is, just like the learning parameter $\varepsilon(t)$, usually modified during the learning process according to a suitable heuristic, e.g. similarly to equation (2):

$$\sigma(t) = \sigma(0) \left( \frac{\sigma(t_{\max})}{\sigma(0)} \right)^{t/t_{\max}}, \quad t \in [0, t_{\max}]. \qquad (10)$$

From these definitions, the training of a self-organizing map, according to [36], can be described as a technical procedure as follows:

- (i) Initialization: Choose suitable initial values for the codebook vectors $w_j$. In the absence of any a-priori information, the $w_j$ can, e.g., be chosen randomly.
- (ii) Stimulus Choice: Randomly choose a vector $x$ among the entered data in the feature space.
- (iii) Response: Determine the winner neuron according to equation (8).
- (iv) Adaptation Step: Perform an adaptation step by modifying the codebook vectors according to equation (9).
- (v) Iteration: Repeat steps (ii)-(iv) until a suitable stop criterion is fulfilled.
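For illustration, a minimal sketch of steps (i) to (v), assuming a two-dimensional discrete grid as model cortex, the Gaussian neighborhood of equation (6), and the exponential cooling schemes of equations (2) and (10); all function names and parameter values are illustrative choices, not prescribed by the algorithm:

```python
import numpy as np

def train_som(X, grid=(10, 10), t_max=10000,
              eps0=0.5, eps_end=0.01, sig0=5.0, sig_end=0.5, seed=0):
    """Self-organizing map, steps (i)-(v); cf. equations (2), (6), (8)-(10)."""
    rng = np.random.default_rng(seed)
    # grid positions r_j of the codebook neurons (the model cortex, k = 2)
    R = np.array([(i, j) for i in range(grid[0]) for j in range(grid[1])], dtype=float)
    # (i) initialization: random codebook vectors w_j in the feature space
    W = rng.random((len(R), X.shape[1]))
    for t in range(t_max):
        eps = eps0 * (eps_end / eps0) ** (t / t_max)        # learning parameter, eq. (2)
        sig = sig0 * (sig_end / sig0) ** (t / t_max)        # neighborhood radius, eq. (10)
        x = X[rng.integers(len(X))]                         # (ii) stimulus choice
        winner = np.argmin(np.linalg.norm(W - x, axis=1))   # (iii) response, eq. (8)
        d2 = np.sum((R - R[winner]) ** 2, axis=1)           # squared grid distances
        psi = np.exp(-d2 / (2.0 * sig ** 2))                # Gaussian neighborhood, eq. (6)
        W += eps * psi[:, None] * (x - W)                   # (iv) adaptation step, eq. (9)
    return R, W                                             # (v) iteration ends after t_max steps
```

Replacing the Gaussian neighborhood by the Kronecker delta of equation (3) would reduce this sketch to the winner-takes-all LBG update of section 1.1.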

For further details of the self-organizing maps, please refer to [36], the disclosure of which is, by this reference, included in the present application.

2 DETAILED DESCRIPTION OF THE INVENTION, PART I

The invention is thus based on the problem of improving data processing.

The invention solves this problem with the subjects of claims 1, 6, 7, 16, and 17, respectively.

Further preferred variations of the invention are described in the sub-claims.

According to claim 1, in a genus-conform method, at least part of the entered data objects is used to determine the arrangement of neurons in the ordering space. Alternatively or additionally, data objects required for the data processing that are independent of the input data are entered, which are used as data objects of the exploration space.

According to claim 6, in a genus-conform method, the distances are calculated by statistical learning methods, local models, methods of inferential statistics, and/or one of the following special computational methods: Levenshtein measure, mutual information, Kullback-Leibler divergence, coherence measures employed in signal processing, specifically for biological signals, LPC cepstral distance, calculation methods that relate the power spectra of two signals to each other, such as the Itakura-Saito distance, the Mahalanobis distance, and/or calculation methods relating to the phase synchronization of oscillators.
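As a concrete illustration of one such computation method, the mutual information of two signals can be estimated from a joint histogram and converted into a dissimilarity. This is a minimal sketch under illustrative assumptions; the bin count and the conversion formula are free choices, not prescribed by the claim:

```python
import numpy as np

def mutual_information(x, y, bins=32):
    """Histogram estimate of the mutual information I(X;Y) between two signals."""
    pxy, _, _ = np.histogram2d(x, y, bins=bins)
    pxy /= pxy.sum()                          # joint probability estimate
    px = pxy.sum(axis=1, keepdims=True)       # marginal distribution of x
    py = pxy.sum(axis=0, keepdims=True)       # marginal distribution of y
    nz = pxy > 0                              # skip empty bins to avoid log(0)
    return float(np.sum(pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])))

def mi_dissimilarity(x, y, bins=32):
    """One possible conversion of the similarity I(X;Y) into a distance object:
    large mutual information -> small dissimilarity."""
    return 1.0 / (1.0 + mutual_information(x, y, bins))
```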

According to claim 7, in a genus-conform method, the measure of the quality of the assignment is calculated employing at least a part of the entered and/or calculated distance objects.

Regarding the claim concerning a data processing device, it should be mentioned that the term "data processing device" includes, besides the presently common ones (e.g. semiconductor-technology based computing systems), also all future realizations of data processing devices (e.g. quantum computers, or realizations based on biological, molecular, nuclear, optical, or any other biological, chemical, or physical principles of data processing, etc.).

Concerning the claim related to the computer program product, it should be mentioned that by the term "computer program product" a computer program or a computer program module is meant, which is embodied by storage (e.g. on a magnetic storage medium or in a volatile or non-volatile semiconductor memory of a computer) or by signals transmitted in a network, specifically in the internet. In this context, the computer program does not have to be available in an immediately executable form, but can also exist in a form prepared for installation in the data processing device, where, of course, it can be compressed, coded, broken up into packets, and provided with headers for an eventual transmission through a network, etc.

The invention, as well as further characteristics and advantages of the invention, will now be described in more detail based on preferred realization examples.

First, the construction of a computer system, as a special realization example of a data processing device, will be roughly explained. Usually, such a computer system includes a computer, a monitor, an input keyboard, and a computer mouse. In place of the monitor, any other display device can be used, for instance a projector. In place of the mouse, any other cursor-positioning device can be used, for instance a trackball, a touchpad, a mouse stick, a touch screen, or the cursor keys of a computer keyboard.

The computer has a first data storage device in the form of a disc memory, such as a hard disk, CD, or diskette, and a second data storage device in the form of a main memory and/or working memory. Information is transferred between the disc memory and the working memory. The transmission takes place, e.g., through usual interfaces and bus systems. The data processing is performed by a CPU (Central Processing Unit). In the disc memory, data are saved which the computer can access by appropriate control mechanisms. The computer further includes a network card, through which it can be connected, e.g., to a second computer. Moreover, the computer can include a so-called modem, through which it can be connected, over the telephone network and its respective provider, to the internet. The computer can also be part of a direct PC connection, an additional computer of a computer cluster, or a server of a network.

2.1 General View on Topology-Preserving Mappings

The terms essential for the understanding of the invention are described in the following.

The starting point is data processing by use of so-called "topology-preserving mappings". This term refers to different, state-of-the-art data processing methods. Important examples are: Self-Organizing Map (SOM) [24] (as described in section 1.2), Generative Topographic Mapping (GTM) [4, 2], the Neural Gas algorithm [28], different forms of topographic vector quantizers (e.g. Topographic Vector Quantizer (TVQ), Soft Topographic Vector Quantizer (STVQ), Soft Self-Organizing Map (SSOM), Kernel-Based Soft Topographic Mapping (STMK), Soft Topographic Mapping of Proximity Data (STMP)) [13, 14], as well as numerous variants of the cited methods.

In spite of this diversity, topology-preserving mappings have essential common functional and structural components, which are characterized in the following definitions.

2.1.1 General Definitions

1. Data Objects: Any data without any restrictions, such as sets, numbers, vectors, graphs, symbols, texts, images, signals, mathematical mappings and their representations, e.g. matrices, tensors, etc., as well as any combination of data objects.

2. Space: Any set of data objects, e.g. also a subset and/or a superset of a set of data objects.

2.1.2 Functional Definitions

Input Data: Raw data are any data objects to be supplied to the data processing, e.g. sets, numbers, vectors, graphs, symbols, texts, images, signals, mathematical mappings and their representations, etc. These raw data serve directly as input data or are transformed into input data by suitable calculation procedures. In the following, therefore, no distinction will be made between raw data and input data; only the term input data will be used. It is essential that the input data comprise those data objects for which there exists a problem for the data processing related to the topology-preserving mapping, i.e. which should be analyzed, visualized, or processed in any other way. Typical problems for the data processing of these input data are e.g. partitioning, clustering, embedding, principal component analysis, approximation, interpolation, extrapolation, dimension determination, visualization, control, etc. For the definition of the input data, two aspects are thus essential: input data are (i) given data objects or data objects calculated from given data objects, (ii) for which there exists a problem, i.e. "something given that something should be done with".

Structure Hypotheses: These are assumptions, e.g. about the structure of the input data. Structure hypotheses are assumptions that cannot be calculated from the input data without additional data objects that are independent of the input data of the topology-preserving mapping. This means, the assumptions

- (i) are postulated ad hoc, whereby the hypotheses are chosen independently of the input data, or
- (ii) are postulated ad hoc, whereby the hypotheses are influenced, but not completely determined, by predictable characteristics of the input data, or
- (iii) can only be calculated by processing of the input data taking into account the topology-preserving mapping itself, i.e. by taking into account output data (see the definition below),

or are made by any combination of these procedures. Here, (iii) is a special case of (ii), in so far as a topology-preserving mapping requires structure hypotheses. Typical examples of the formation of structure hypotheses are:

- Ad (i): Choice of the grid topology in Kohonen's algorithm as a two-dimensional quadratic grid with a given number of grid nodes for both dimensions, independently of the input data.
- Ad (ii): Choice of the grid topology in Kohonen's algorithm as a two-dimensional quadratic grid with a given total number of nodes (ad-hoc component of the hypothesis), whereby the ratio of the numbers of grid nodes for the two dimensions takes into consideration the ratio of the variations of the input data distribution along its two main directions, as can be determined by calculating the two largest eigenvalues within a principal component analysis of the input data distribution (data-driven component of the hypothesis; see the sketch below).
- Ad (iii): Choice of the grid topology depending on the data representation by the topology-preserving mapping itself, e.g. in growing self-organizing maps [44]; choice of the grid topology depending on the topology induced by the distribution of the codebook vectors, e.g. the topology induced in the case of a minimal spanning tree of the codebook vectors [24], or the topology induced by an ordering metric of the codebook vectors in the Neural Gas algorithm [28].

Structure hypotheses are thus data objects that are required for the data processing and independent of the input data to be processed. "Independent" means that there is no calculation method by which these data objects can be calculated by using only input data of the topology-preserving mapping, i.e. without referring to structure hypotheses.
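The data-driven component in example Ad (ii) can be made concrete as follows; this is a minimal sketch, in which the function name, the total node count, and the rounding policy are illustrative assumptions:

```python
import numpy as np

def grid_shape_from_pca(X, n_nodes=100):
    """Derive the side lengths of a 2D grid from the two largest eigenvalues
    of a principal component analysis of the input data distribution."""
    C = np.cov(X, rowvar=False)                  # covariance matrix of the input data
    lam = np.sort(np.linalg.eigvalsh(C))[::-1]   # eigenvalues, descending
    ratio = np.sqrt(lam[0] / lam[1])             # ratio of spreads along the two main directions
    n2 = max(1, int(round(float(np.sqrt(n_nodes / ratio)))))  # nodes, minor direction
    n1 = max(1, int(round(n_nodes / n2)))                     # nodes, major direction
    return n1, n2
```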

Output Data: These are data objects that can be interpreted as a result of the processing of the input data by the topology-preserving mapping. These are typically

- (i) codebook objects and quantities calculated from them (see the definition below),
- (ii) structure hypotheses motivated by codebook objects or values calculated from them.

Output data are by no means only codebook objects or structure hypotheses after a completed training of a topology-preserving mapping, but can correspond to any training level of the topology-preserving mapping.

2.1.3 Structural Definitions

Exploration Space: Space of the set of data objects with which the topology-preserving mapping is trained, i.e. which are presented to, i.e. entered into, the calculation rule of the topology-preserving mapping for calculating the codebook objects (e.g. x in section 1.2). These data objects are in the following called exploration objects and, for historical reasons implied by the technical standard, also synonymously called feature vectors. Note that, according to the technical standard, these data objects correspond to the input data of the topology-preserving mapping. An essential aspect of the invention is that this correspondence is removed.

Ordering Space: Space of the set of data objects that define topological relations, whereby these relations are used for the calculation of the output data of the topology-preserving mapping, e.g. in a cooperativity function according to equation (9). An important example of an ordering space is the model cortex in Kohonen's algorithm, also called grid space or index space. The data objects of the ordering space are, in the following, called ordering objects or neurons. According to the technical standard, these are vectors in metric spaces. For historical reasons implied by the technical standard, in the following, they are also called grid vectors or position vectors. A further important partial aspect of the invention is the generalization of the term ordering objects towards any data objects, e.g. distance objects between data objects (for the definition of the term distance object, see below). Note that, according to the technical standard, the data objects of the ordering space are determined by structure hypotheses. An essential aspect of the invention is to remove this correspondence. To clearly distinguish between the terms ordering space and exploration space, the definitions given for different topology-preserving mappings described in the literature are listed in section 2.3.

Outcome Space: Space of the set of data objects regarded as the result of the processing of input data (e.g. codebook vectors of a self-organizing map). These are, in the following, called codebook objects.

The spaces above are not necessarily required to be different. Frequently, for example, the outcome space corresponds to the exploration space: this is the case for the self-organizing map of section 1.2. Here the feature vectors, in general, originate from the same space as the codebook vectors. An opposite example would be the clustering of observation series by Hidden Markov Models (e.g. [34]): here, specific Hidden Markov Models can correspond to single codebook objects, while the observation series correspond to the exploration objects. As a further example, the ordering space can also be defined implicitly by the outcome space, e.g. as an ordering metric in the Neural Gas algorithm.

The central motivation of the present invention is thus based on a disentanglement of the functional and structural characteristics of topology-preserving mappings listed above.

Here it is decisive that, according to the technical standard, the input data are exploration objects, i.e. they are taken from the exploration space and do not define data objects of the ordering space. Further, the structure hypotheses influence the ordering space and not the exploration space. According to the technical standard, the exploration space is thus assigned to the input data and the ordering space is assigned to the structure hypotheses.

The central idea of Exploration-Organized Morphogenesis (XOM) now is the partial reversal of these assignments.

XOM Definition: Method and device for data processing by topology-preserving mappings, whereby, in contrast to the technical standard, both the exploration space and the ordering space can be determined in any way by input data or structure hypotheses. In particular, in contrast to the technical standard, input data can determine data objects of the ordering space and, conversely, structure hypotheses can determine data objects of the exploration space.

The statement that input data "determine" data objects of the ordering space means here that there is a calculation method with which data objects of the ordering space can be calculated from input data without using structure hypotheses.

The statement that structure hypotheses "determine" data objects of the exploration space means here that there is no calculation method with which these data objects can be calculated from input data without using structure hypotheses.

In contrast to the technical standard, the choice of the exploration space is not subject to any limitations, in so far as its data objects can also be, besides input data, structure hypotheses. Conversely, in contrast to the technical standard, the choice of the ordering space is not subject to any limitations, as its data objects can be, besides structure hypotheses, input data as well!

In contrast to the technical standard, the ordering space can thus be assigned to the input data, and the exploration space can be assigned to the structure hypotheses.

A special aspect of the invention is, additionally, the generalization of the term ordering objects as defined above, beyond the interpretation as vectors in metric spaces, as is technical standard, towards any data objects, especially distance objects between data objects.

Distance objects are defined here as data objects that characterize similarity relations or distances between data objects, according to any distance measure. Here, both distance measures induced by metrics and, in particular, similarity relations or dissimilarities defined by any distance measures that do not satisfy a metric are included. Some typical distance measures based on dissimilarities are described, for example, in [19]. Metric is here defined in the mathematical sense, as e.g. in [5].

In summary, as a differentiation from the technical standard, the definitions above lead to the following

2.2 Technical Description

The invention-conform method (XOM) for the mapping of input data to be processed to output data comprises the following steps:

- The data objects to be processed are entered as input data.
- The entered data objects are processed by means of a topology-preserving mapping. For that purpose:
- Neurons are ordered in the ordering space, where, according to a first alternative, at least part of the entered data objects is used for determining the ordering of neurons in the ordering space.
- Further, in doing so, codebook objects in the outcome space are assigned to the neurons.
- Finally, in doing so, codebook objects are processed, according to the calculation rule of a topology-preserving mapping, by use of data objects of the exploration space (refer, for instance, to the technical procedure for the training of a self-organizing map presented in section 1.2 of the introduction of the description).
- According to a second alternative, in doing so, data objects (structure hypotheses) entered independently of the input data to be processed are used as data objects of the exploration space. The first and second alternatives can be applied alone or in combination.
- In the end, the processed codebook objects are delivered as output data.

2.3 Examples of the XOM Definition for Some Topology-Preserving Mappings

The XOM definition, as given above, will be described exemplarily for some topology-preserving mappings described in the literature. It should, however, be emphasized that the invention is not limited to these examples, but can be applied analogously, by use of the above definitions for the function and structure components, to any topology-preserving mappings, even if those are not explicitly listed here. It should be particularly emphasized that the invention is independent (i) of the concrete choice of free parameters of topology-preserving mappings, (ii) of the concrete choice of a cooperativity function, e.g. in the sense of the function ψ according to section 1.1, (iii) of the concrete choice of certain annealing schemes, e.g. for learning parameters in the sense of ε in section 1.1, and (iv) of the kind of data presentation, i.e. whether the exploration or ordering data objects are presented sequentially or in parallel in the sense of batch algorithms, where in a single training step more than one data object can be processed.

2.3.1 XOM for the Self-Organizing Map

Here, the input data can determine, according to the XOM definition, the data objects of the ordering space, while structure hypotheses can determine the characteristics of the exploration space. For self-organizing maps, in connection with XOM, the following stipulations apply: the feature space of the self-organizing map, according to section 1.1, corresponds to the exploration space; the model cortex of the self-organizing map, according to section 1.2, corresponds to the ordering space.

In the model cortex, according to section 1.2, thus input data are presented, i.e. the model cortex is wholly or partially determined by the input data. If the input data, for instance, are vectors $Z$ in a $k$-dimensional metric space, i.e. $Z = \{z^{(v)} \mid z^{(v)} \in \mathbb{R}^k,\ v \in \{1, \ldots, p\},\ k, p \in \mathbb{N}\}$, then the position vectors of the self-organizing map can be set equal to these. From this, in general, a topology of the ordering space determined by the input data results, which, in contrast to the technical standard in the use of self-organizing maps, does not correspond to a discrete periodic grid. The training of the self-organizing map is then carried out with data objects of any arbitrarily chosen exploration space. This exploration space can correspond to a structure hypothesis, or else be defined directly via input data. With the conventions of section 1.2, for instance, the following stipulation can thus be made: $r^{(v)} = z^{(v)}$. As exploration space, any set of data objects is then determined, e.g. data vectors on any manifold in $\mathbb{R}^n$, which satisfy, for example, a uniform distribution, a Gaussian distribution, or any distribution described in probability theory. Any other specification of the exploration space is, in principle, conceivable as well, where this specification may depend on the input data or, in the sense of a structure hypothesis, may not be unequivocally computable directly from input data.

Besides the disentanglement of the structural and functional definitions of the data spaces employed in topology-preserving mappings, as described above, a special aspect of the invention consists, additionally, in the generalization of the term ordering objects defined above, beyond the interpretation as vectors in metric spaces, as is technical standard, towards any distance objects between data objects. In the case of self-organizing maps, this means that the topology of the model cortex can be defined by any dissimilarities, which do not have to satisfy any metric in the mathematical sense. The ordering objects (neurons) thus need not represent vectors in $\mathbb{R}^k$.
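To make the reversal of the assignments tangible, the following minimal sketch implements XOM for the self-organizing map with the stipulation r^(v) = z^(v): the input data Z define the neuron positions in the ordering space, while the exploration space is a structure hypothesis, here illustratively a uniform distribution on the unit square or cube; the function name and all parameter values are assumptions for illustration, reusing the Gaussian neighborhood and cooling schemes of section 1.2:

```python
import numpy as np

def train_xom(Z, t_max=20000, dim_out=2,
              eps0=0.5, eps_end=0.05, sig_end=0.01, seed=0):
    """XOM for the SOM: neuron positions r^v are set equal to the input data
    z^v (ordering space); the exploration space is a structure hypothesis,
    here a uniform distribution on the unit square/cube."""
    rng = np.random.default_rng(seed)
    R = np.asarray(Z, dtype=float)          # ordering space defined by the input data
    sig0 = np.ptp(R, axis=0).max()          # initial neighborhood radius from the data spread
    W = rng.random((len(R), dim_out))       # codebook objects in the exploration space
    for t in range(t_max):
        eps = eps0 * (eps_end / eps0) ** (t / t_max)
        sig = sig0 * (sig_end / sig0) ** (t / t_max)
        x = rng.random(dim_out)             # exploration object drawn from the hypothesis
        winner = np.argmin(np.linalg.norm(W - x, axis=1))
        d2 = np.sum((R - R[winner]) ** 2, axis=1)   # distances in the input-data topology
        psi = np.exp(-d2 / (2.0 * sig ** 2))
        W += eps * psi[:, None] * (x - W)
    return W
```

The returned codebook vectors can then be read, neuron by neuron, as a nonlinear embedding of the input data Z into the chosen exploration space.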

2.3.2 XOM for Generative Topographic Mapping

For the specification of the terms ordering space and exploration space in the Generative Topographic Mapping (GTM), the following stipulations are made, which refer to the publication [3]: the space called "latent space" in [3] corresponds to the ordering space; its data objects are called "latent variables x" in this publication. The data objects of the exploration space are denoted by the variable t in [3].

2.3.3 XOM for Topographic Vector Quantizers

For the specification of the terms ordering space and exploration space, the following stipulations are made about the topographic vector quantizers described in the literature (e.g. Topographic Vector Quantizer (TVQ), Soft Topographic Vector Quantizer (STVQ), Soft Self-Organizing Map (SSOM), Kernel-Based Soft Topographic Mapping (STMK), Soft Topographic Mapping of Proximity Data (STMP)) [13, 14], which refer to the publication [14]: the data objects called "nodes" in [14], with the variable designation r or s, correspond to data objects of the ordering space. The data objects called "data vectors x(t)" in [14] correspond to the data objects of the exploration space.

By analogy, it is also possible to distinguish between exploration space and ordering space in other calculation schemes for topology-preserving mappings not described here.

3 DETAILED DESCRIPTION OF THE INVENTION, PART II

In the following listing, additional methods, devices, and applications to be protected by the patent are described.

1. (a) XOM Definition: Method and device for data processing by use of topology-preserving mappings, whereby, in contrast to the technical standard, both the ordering space and the exploration space can be defined freely by input data or by structure hypotheses. In particular, in contrast to the technical standard, the input data can define data objects of the ordering space and, vice versa, structure hypotheses can define data objects of the exploration space.

The statement that input data "define" data objects of the ordering space means here that there is a calculation method which allows the calculation of data objects of the ordering space from input data without consideration of structure hypotheses.

The statement that structure hypotheses "define" data objects of the exploration space means that there is no calculation method by which these data objects can be calculated from input data without consideration of structure hypotheses.

Unlike the technical standard, the choice of the exploration space is not subject to any limitations, in so far as its data objects can be, besides input data, also structure hypotheses.

Hereby, it is explicitly not required that the data be uniformly distributed on a single manifold in $\mathbb{R}^n$; they can rather be distributed according to any distribution in any data spaces. Examples of interesting distribution patterns are listed below in item 5. During the training process, or in the context of a series of training processes of the topology-preserving mapping, these distribution patterns can also be chosen dynamically variable, e.g. under consideration of the output data or results supplied by the topology-preserving mapping at the current or at an earlier state, like codebook objects or the topology induced by these objects, whereby, in particular, dynamical structure hypotheses can be generated. It should be emphasized as well that the chosen distributions in the exploration space may be influenced statically or dynamically by the input data.

Conversely, the choice of the ordering space is also, in contrast to the technical standard, not subject to any limitations, as its data objects can be, besides structure hypotheses, input data as well!

In contrast to the technical standard, the ordering space can thus be assigned to the input data, and the exploration space can be assigned to the structure hypotheses.

A special aspect of the invention is, additionally, the generalization of the term ordering objects as defined above, beyond the interpretation as vectors in metric spaces, as is technical standard, towards any data objects, in particular distance objects between data objects.

Distance objects are defined here as data objects that characterize similarity relations or distances between data objects, according to any distance measure. Here, distance measures induced by metrics, as well as, in particular, similarity relations or dissimilarities defined by any non-metric distance measures, are included. Some typical distance measures based on dissimilarities are, for example, described in [19]. Metric is here defined in the mathematical sense; refer e.g. to [5].

XOM can, in particular, also be used for data processing if more than one connected data distribution in the exploration space is used for the training; if no uniform data distribution in the exploration space is used for the training; if the data objects in the ordering space, or subsets thereof, do not satisfy any metric in the mathematical sense; if the data distributions in the exploration space used for training are not convex; if the data objects in the ordering space or in the exploration space, or subsets thereof, do not satisfy Euclidean geometry, or their distance is defined by any dissimilarity measure; if distances of any data objects are used for the training, e.g. geodetic distances or a rank metric; if the topology-preserving mapping does not correspond to the sequential formulation of a self-organizing map according to Kohonen; if the distribution of the training data in the exploration space, employed for the training of the topology-preserving mapping, has a dimension other than 2 or 3; if the distribution of the training data in the exploration space, employed for the training of the topology-preserving mapping, is not a 3D sphere; if the training rule of the topology-preserving mapping can differ for different codebook objects, see also item 1m; or if not all connections or topological relations, for which the distances are known or have been calculated, are displayed for the visualization of the results.

(b) Irregular Structure of the Ordering Space: An essential aspect of the invention is that in XOM the limitation of the ordering space to discrete periodic grids in $\mathbb{R}^n$, e.g. regular cubical or hexagonal grids, as is technical standard, is removed, in particular if input data are used to determine characteristics of the ordering space, for instance its topology, and/or structure hypotheses are used to determine the characteristics of the exploration space. Particularly protected is the use of ordering spaces with a fractal local or global dimension.

(c) Combination of an Irregular Structure of the Ordering Space with XOM: It should be specifically emphasized that such methods and devices are a special aspect of the invention with which an irregular structure of the ordering space, according to 1b, is given and, at the same time, input data (and not only structure hypotheses) are used to determine the characteristics of this ordering space, e.g. to specify its topology.

(d) Determination of the Exploration Space by Means of Structure Hypotheses: Another essential aspect are methods and devices that use structure hypotheses (and not only input data) to determine the characteristics of the exploration space.

(e) Arbitrary Distance Measures, e.g. Pairwise Dissimilarities: A special aspect of the invention is the generalization of the term ordering objects as defined above, beyond the interpretation as vectors in metric spaces, as is technical standard, towards any data objects, in particular distance objects between data objects. This is of special interest if the ordering objects are defined by use of input data.

Distance objects are defined here as data objects that characterize similarity relations or distances between data objects, according to any distance measure. Here, distance measures induced by metrics, as well as, in particular, similarity relations or dissimilarities defined by any non-metric distance measures, are included. Some typical distance measures on the basis of dissimilarities are described, e.g., in [19]. Metric is here defined in the mathematical sense, as in [5]. An important example is the use of a rank metric (e.g. in analogy to the definition of the rank metric between the winner neuron and the other codebook neurons in the Neural Gas algorithm); a sketch follows below. Distances between data objects, i.e. distance objects, can thus, in principle, be defined by any calculation methods or also by structure hypotheses.
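A minimal sketch of such a rank-based distance object, assuming a precomputed matrix D of pairwise distances between the data objects; the construction is generally asymmetric and therefore a dissimilarity rather than a metric in the mathematical sense:

```python
import numpy as np

def rank_dissimilarity(D):
    """Rank-based dissimilarity: entry (i, j) is the rank of object j in the
    list of all objects sorted by their distance to i (0 = i itself)."""
    # argsort applied twice converts each row of distances into ranks
    return np.argsort(np.argsort(D, axis=1), axis=1)
```

For a distance matrix D of the ordering objects, rank_dissimilarity(D) could then replace D as the topology-defining distance object.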

It should also be stressed that it is not necessary for the invention, regarding a distance measure, to calculate all pairwise distances between the input data objects or ordering objects, or to use all of them for the training of the topology-preserving mapping. It is also not necessary to define these distances for all pairs of data objects. It is sufficient to use any subset of the calculable pairwise distances. This subset can be adapted to the current objective or to the circumstances of the data processing, possibly also dynamically. Such an adaptation is required in numerous situations, e.g. (i) in the so-called Sparseness Annealing (see below), (ii) in the visualization of graphs, where not all edges between the nodes are known or should be considered in the calculation, (iii) in molecular-dynamics simulations, where, due to the constraints defined by the covalent structure of the molecule or the forces acting between the single atoms, only a proper subset of the pairwise distances between the atoms is defined or should be used for the training of the topology-preserving mapping, and (iv) in robotics applications, e.g. in the context of inverse kinematics, where, e.g. due to constraints, only a proper subset of the pairwise distances between the robot joints is defined or should be used for the training of the topology-preserving mapping.

Particularly interesting is the case of sparsely coded distance matrices.

(f) Non-Metric Ordering Spaces and Input Data Spaces: It should, once more, be specifically emphasized that, in contrast to the technical standard, methods and devices are included in the invention which employ, in a mathematical sense, non-metric distance measures for the determination of the topology of the ordering space, in which, e.g., for a proper or improper subset of the pairwise distances, the symmetry relation and/or the triangle inequality is not satisfied.

The ordering objects, in contrast to the technical standard, can thus define a non-metric space, i.e. a space not corresponding to a metric space according to the definition in [5]. This partial aspect of the invention is specifically protected in situations where not only structure hypotheses, but also input data are used to determine the topology of the ordering space.

(g) Non-Euclidean XOM: The ordering space, exploration space, or outcome space, or any combination of these spaces, can satisfy a non-Euclidean, e.g. hyperbolic, geometry.

(h) Local Neighborhoods, Acceleration by Fast Search of Nearest Neighbors, Sparse Distance Matrices: For the training of the topology-preserving mapping, specifically, only, or only partially, data objects from local neighborhoods of the data objects in the ordering space and/or exploration space, i.e. sparsely coded distance matrices created thereby, can be used; a sketch follows below. For this, in particular, acceleration strategies for the determination of the local neighborhoods, e.g. for the search of the k nearest neighbors, can be used, according to the technical standard (refer to e.g. [30], [29] and the literature cited in these publications, as well as [9]) or to this patent application, specifically concerning items 5 and 12 below.
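One conceivable realization of such a sparsely coded distance matrix from local neighborhoods, using a standard k-d-tree-based fast nearest-neighbor search; the neighborhood size k is an illustrative choice:

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.spatial import cKDTree

def knn_sparse_distances(R, k=8):
    """Sparsely coded distance matrix keeping, for each data object,
    only the distances to its k nearest neighbors."""
    tree = cKDTree(R)
    dist, idx = tree.query(R, k=k + 1)   # k+1: each point is its own nearest neighbor
    rows = np.repeat(np.arange(len(R)), k)
    return csr_matrix((dist[:, 1:].ravel(), (rows, idx[:, 1:].ravel())),
                      shape=(len(R), len(R)))
```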

(i) Fractals: A special aspect of the invention is that, in methods and devices according to the definition of XOM above, data distributions can be used as ordering spaces which, according to the literature (e.g. [27] and the literature cited there, [16]), as well as to the dimension determination methods described in this patent application, have locally or globally a fractal dimension. Local, here and in the following, means that the dimension determination is carried out for single data objects, whereas global means that the dimension determination is carried out for more than one data object, e.g. for a complete data set. Conversely, data distributions with a fractal dimension can also define the exploration space. Specifically protected are methods and devices where the ordering space contains data distributions with a fractal dimension, whereby these data distributions are input data, as well as methods and devices where the embedding space comprises data distributions with a fractal dimension, whereby these data distributions are structure hypotheses. Specifically protected is also the combination of both possibilities.

(j) Non-Orientable Surfaces, Möbius Strip and Klein Bottle: The ordering space, as well as the exploration space, can contain data distributions in which the topology induced by the data objects in the respective spaces describes a non-orientable surface in the sense of differential geometry, e.g. a Möbius strip or a Klein bottle. Specifically protected are methods and devices where the ordering space contains such data distributions, whereby these data distributions are input data, as well as methods and devices where the exploration space contains such data distributions, whereby these data distributions are structure hypotheses. Specifically protected is also the combination of both possibilities.

(k) Stochastic XOM: The ordering space, as well as the exploration space, can contain data distributions that result from a random experiment. Specifically protected are methods and devices where the topology induced in the ordering space by the data objects is influenced by a random experiment, or where the data objects of the exploration space are influenced by a random experiment in the sense of a structure hypothesis, as well as combinations of both possibilities.

(l) Addition or Omission of Data Objects in the Ordering Space: Based on the definition of XOM, methods and devices can be constructed where, during a training process, or before or during a series of training processes of the topology-preserving mapping, one or more new data objects, specifically also distance objects, are added to the ordering space, and the topology-preserving mapping is retrained partially or completely. Specifically, this method can be employed for interpolation, extrapolation, or approximation of new data objects by the topology-preserving mapping. Conversely, data objects, specifically also distance objects, in the ordering space can be removed or freely modified before the topology-preserving mapping is partially or completely retrained. Specifically, measures of local or global mapping quality, e.g. in the sense of 2, can be used to create, remove, or modify data objects of the ordering space in a goal-directed way.

(m) Codebook-Object-Specific Variation of the Calculation Rule: It should be stressed that, based on the definition of XOM, specifically also methods and devices can be developed where, in the training of the topology-preserving mapping, not all the codebook objects belonging to the data objects in the ordering space are trained according to the same calculation rule. Rather, it is often possible and/or required to apply different calculation rules to different codebook objects, or to modify parameters of the same calculation rule for different codebook objects. Both variations can also occur dynamically during a single training process or in a series of training processes. Specifically, methods and devices are possible as well where not always just one codebook object is assigned to each data object of the ordering space. Rather, different numbers and kinds of codebook objects can be assigned to different data objects of the ordering space, whereby these numbers and kinds can also be chosen as dynamically variable, e.g. regarding the specific data processing problem, the current training state of the topology-preserving mapping, the mapping quality presently or previously achieved, or any additional constraints, e.g. those induced by the data analysis problem. Also, data objects of the ordering space may exist to which permanently or temporarily no codebook objects are assigned. An important example of the dynamic, codebook-object-specific adaptation of the calculation rule is the adaptation of the cooperativity function of self-organizing maps with regard to measures of local topology preservation, e.g. in the sense of methods like [7].

(n) Data-Object-Specific Variation of the Characteristics, e.g. of the Calculation Rule, for Data Objects of the Exploration and/or Ordering Space: The data-object-specific variability described in 1m is also valid in the same sense for different objects of the exploration space and/or of the ordering space; e.g., the calculation rule of the topology-preserving mapping can vary in a data-object-specific way. Specifically, it can also be chosen as dynamically variable, e.g. with regard to the specific data processing problem, the current training state of the topology-preserving mapping, the mapping quality presently or previously achieved, or any additional constraints, e.g. those induced by the data analysis problem. In addition, data objects of the exploration, outcome, and/or ordering space can be dynamically variable regarding, for instance, the criteria just mentioned; e.g., new data objects can be dynamically created and existing data objects can be dynamically removed or modified in any way.

(o) Supervised XOM: The training of the topology-preserving mapping can be performed in dependence on data objects, or characteristics of data objects, that are associated with data objects of the ordering space. An interesting case is if data objects of the ordering space are associated with further data objects which do not appear in the ordering space, or if data objects of the ordering space have additional characteristics which are, permanently or temporarily, not taken into account for the determination of the ordering space. A specifically important case is if these additional data objects or characteristics of data objects are interpreted as function values specified for data objects of the ordering space. Here again, one case is specifically important, where these additional data objects or characteristics of data objects are used to modify the exploration space, the ordering space, the outcome space, or the data processing rule on which the topology-preserving mapping is based, or its parameters, in a goal-directed way, specifically in a data-object-specific way. With this, different XOM-based methods and devices for supervised learning can be constructed, in particular for interpolation, extrapolation, and approximation, or for any other kind of function processing. It should be stressed that the data objects of the ordering space, as well as the additional data objects and object characteristics associated with them, can be input data as well as structure hypotheses.

(p) XOM under Additional Constraints: A specifically important variation of XOM consists in the training of the topology-preserving mapping being influenced by additional constraints which affect any characteristics of the exploration space, ordering space, or outcome space, e.g. regarding the specific data processing problem, the current training state of the topology-preserving mapping, the mapping quality presently or previously achieved, or any additional constraints, e.g. those induced by the data analysis problem. It is, for instance, possible to limit the movement of a proper or improper subset of the codebook objects in the outcome space, statically or dynamically, or to influence it in any other way.

(q) Dynamically Variable Exploration Space, Growing XOM Mappings: XOM implementations should be specifically emphasized where the exploration space or its data objects are influenced, during a training process or over a series of training processes of the topology-preserving mapping, in a goal-directed or non-goal-directed way, i.e. where they are dynamically variable, for instance regarding criteria of the currently or previously achieved local or global quality of the topology-preserving mapping, e.g. in the sense of 2. Specifically important is the case of XOM mappings with growing, shrinking, dividing, or in any other way locally or globally changing exploration spaces, for which structure hypotheses shall be successively improved, e.g. via input data.

(r) Rescaling of the Distances in the Ordering Space, Sparseness Annealing: Specifically interesting XOM methods and devices change the topology of the ordering space during a training process, or over a series of repeated training processes of the topology-preserving mapping, e.g. by a mathematically expressible calculation rule. This rule can depend, for instance, on a currently or previously achieved mapping quality, or on the current number of training steps or training runs. If the topology of the ordering space is represented by a proper or improper subset of the pairwise distances between the data objects of the ordering space, then a global rescaling, i.e. one referring to all the used distances, as well as a local, i.e. individually adapted, rescaling of specific distances can be performed. With this rescaling, e.g. an increase of the proportion of large distances relative to the total number of the utilized distances can be achieved in the course of one or more training procedures of the topology-preserving mapping.

If these large distances have little influence on the training of the topology-preserving mapping, they can be neglected for the further training. This corresponds to an increasing functional 'sparsing' of the distance matrix, i.e. the number of distances to be considered in the training of the topology-preserving mapping decreases. This implies, among other things, a reduction of the computational expense. This method is called "Sparseness Cooling" or "Sparseness Annealing". One of an arbitrary number of possible schemes is, for instance, the following: if $d_{ij}(t)$ are the distances between the data objects $i$ and $j$ of the ordering space at the training step $t$, and $d_{ij}$ are the original distances, a scale change according to

$$d_{ij}(t) = \left( \frac{d_{ij}}{\sigma(t)} \right)^{\alpha},$$

where $\alpha > 0$, can achieve that $d_{ij}(t) > d_{ij}$ for $d_{ij} > \sigma(t)$. Here, $\sigma(t)$ can be chosen as a function monotonically decreasing with $t$ (e.g. in the sense of a cooling scheme as in equation (10)). By variation of $\alpha$, the degree of the non-linear distortion can be influenced. With the rescaling rule

$$d_{ij}(t) = d_{ij} \left( 1 + \left( \frac{d_{ij}}{\sigma(t)} \right)^{\alpha} \right)$$

and $\alpha \gg 1$, the distances for $d_{ij} < \sigma(t)$ stay almost constant, whereas for $d_{ij} > \sigma(t)$ they are clearly upscaled. Under certain circumstances, it can be useful to consider only distances in a certain interval $[a, b]$, where $a, b \in \mathbb{R}$, $a, b \geq 0$, for the training of the topology-preserving mapping. For this, one could set, for instance:

$$d_{ij} \mapsto \begin{cases} 0 : & d_{ij} < a \\ d_{ij} - a : & d_{ij} \in [a, b] \\ \infty : & d_{ij} > b. \end{cases}$$

It should be stressed that the rescaling of the distances is not limited to this or similar calculation rules, but can be applied in a problem-directed way, in any form suitable to the given data processing situation. Further, it is not necessary that rescalings depending on the training status of the topology-preserving mapping are recalculated in every training step. Rather, it can be sufficient to do this only after a series of several training steps, which can result in a considerable reduction of the computational expense.
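The rescaling rules above lend themselves to a compact implementation. The following is a minimal sketch in Python/NumPy of the second (multiplicative) rescaling rule and of the interval restriction, assuming a symmetric matrix `d` of pairwise ordering-space distances; the exponential cooling scheme for σ(t) and all names and parameter values are illustrative assumptions, not prescribed by this application.

```python
import numpy as np

def sigma(t, sigma0=1.0, tau=100.0):
    # Hypothetical cooling scheme: sigma decreases monotonically with t.
    return sigma0 * np.exp(-t / tau)

def rescale_multiplicative(d, t, alpha=4.0):
    # Second rescaling rule: d_ij(t) = d_ij * (1 + (d_ij / sigma(t))**alpha).
    # For alpha >> 1, distances below sigma(t) stay nearly constant,
    # while distances above sigma(t) are strongly upscaled.
    return d * (1.0 + (d / sigma(t)) ** alpha)

def clip_to_interval(d, a, b):
    # Keep only distances in [a, b]: smaller ones collapse to 0,
    # larger ones are excluded from training (here: set to infinity).
    out = np.where(d < a, 0.0, d - a)
    return np.where(d > b, np.inf, out)

# Usage: distances flagged as np.inf can simply be skipped during training,
# which realizes the 'sparsing' of the distance matrix described above.
d = np.random.rand(5, 5); d = (d + d.T) / 2; np.fill_diagonal(d, 0)
print(clip_to_interval(rescale_multiplicative(d, t=50), a=0.1, b=2.0))
```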

(s) Iteration: Methods and devices according to the XOM definition can also be used iteratively, in the way that data objects of the outcome space of a topology-preserving mapping, trained according to XOM, are used, fully or partially, to define, or, at least, to influence the ordering space of a further topology-preserving mapping, or of a new training step or training process of the same topology-preserving mapping. This should be explained in an example: If, for instance, a non-linear embedding of a data distribution in ℝ^k, used as ordering space, into a data distribution in ℝ^n, used as exploration space, is carried out by means of XOM, with k, n ∈ ℕ, then, in the simplest case, the resulting outcome space will be a set of codebook vectors in ℝ^n. These can now be used, directly or by use of an appropriate calculation rule, to define the ordering space of a new XOM mapping, which, for instance, maps the topology induced by these codebook vectors to a data distribution in ℝ^m, m ∈ ℕ; i.e., this topology is then used as ordering space of a topology-preserving mapping whose exploration space is given by the data distribution in ℝ^m. From the outcome space of the first application of XOM, the ordering space of a further application of XOM is generated. This procedure can be iterated without limitation. In an important special case, the outcome space or the exploration space, on one side, and the ordering space, on the other, exchange their roles mutually. This can also be performed iteratively. Of course, the choice of the determining data objects, spaces, and distance measures is not subject to any limitations.

(t) Self-Organization, Self-Regeneration, Self-Reproduction, Morphogenesis, Distributed Knowledge Representation: With XOM, efficient methods and devices can be constructed which possess characteristic properties of living systems, specifically self-regeneration, self-reproduction, and self-stabilization, locality of information processing, or distributed knowledge representation. The following example illustrates the construction of such a system: The starting point are data objects, in the following denoted as "cells". These represent parts of a system, in the following denoted as "organism". A cell owns the following data objects, characteristics, methods or devices, in the following denoted as "elements":

-   Information necessary for partially or completely building the ordering space of a topology-preserving mapping. This corresponds to the "blueprint" of the organism. It does not have to be contained completely in each cell.
-   Method and device with the help of which the cell can determine which data objects of the ordering space, exploration space, and/or outcome space of a topology-preserving mapping are assigned to it, and in which way. The information gained by this is in the following denoted as the "position" in the respective space.
-   Method and device with the help of which the cell can communicate its positions in the ordering space, exploration space, and/or outcome space, or other information, to other cells, and utilize such information communicated by other cells.
-   Method and device with the help of which the cell can determine a new position in the outcome space by use of a XOM mapping.
-   Method and device with the help of which the cell can modify its position in the outcome space, e.g. with regard to the new position in the outcome space determined by use of a XOM mapping.
-   Method and device with the help of which the cell can check and, if necessary, correct the consistency of its positions in the different spaces.

Optionally, the following elements can also be present:

-   Method and device for the self-copying of the cell
-   Method and device for the self-destruction of the cell
-   Method and device for the modification of the data objects or characteristics of the cell.

It should be stressed that a cell does not have to own all elements listed above. It is also to be stressed that the listed elements do not have to be represented "locally", i.e. in each single cell. It is rather possible that the cell has access to global representations of the elements described above, i.e. representations related to more than one cell. This can, in particular, provide advantages for the construction of XOM-based technical systems.

The "life", i.e. the functional status of the organism, is then determined essentially by XOM. In the following, a typical example of this, a sequential procedure where all cells take part in all steps, will be outlined. However, these conditions do not have to be fulfilled, i.e. other procedures can be developed in analogy, where not all cells participate in every step and/or parallel data processing takes place, i.e. the processing of several data objects at one time. The following presentation is motivated by the procedure for self-organizing maps. It can, however, be easily adapted to other topology-preserving mappings.

First, a data object of the exploration space of a topology-preserving mapping is chosen, in the following called "stimulus". All cells then determine their position in the exploration space in relation to this stimulus. It is, for instance, possible that the cell determines the distance (in any distance measure) between itself and this stimulus in the exploration space. The cells exchange information about their relative position to this stimulus. The cells now compare this information and determine one cell whose position in the exploration space corresponds best to the stimulus, the so-called "winner cell". This winner cell then communicates its position in the ordering space to all other cells. The cells compare this information with their knowledge of the ordering space. It is essential that this knowledge is locally available in each cell (distributed knowledge representation). This knowledge corresponds to a "blueprint" of the organism, available partially or totally in each cell. By this, it is possible for the cells to determine the position of the winner cell in the ordering space. By comparison of their own position in the ordering space to that of the winner cell, the individual cells can determine their distance (in any distance measure) from the winner cell in the ordering space. By means of XOM, the cells then use this distance to update their position in the outcome space. With the presentation of a new stimulus, the procedure starts again. A minimal sketch of one such update step is given below.
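The sequential step just described closely parallels one training step of a self-organizing map. The following Python/NumPy sketch makes it concrete under simplifying assumptions: cell positions in the ordering space (`order_pos`) and outcome space (`outcome_pos`) are real vectors, the exploration space coincides with the outcome space, and a Gaussian neighborhood with width `sigma` and learning rate `eps` is used; all names and parameter values are illustrative, not part of this application.

```python
import numpy as np

def xom_step(order_pos, outcome_pos, stimulus, eps=0.1, sigma=0.5):
    """One sequential update step: find the winner cell, then move every
    cell's outcome position toward the stimulus, weighted by its
    ordering-space distance from the winner ("blueprint" knowledge)."""
    # Each cell determines its distance to the stimulus in the exploration
    # space (here identified with the outcome space).
    d_expl = np.linalg.norm(outcome_pos - stimulus, axis=1)
    winner = np.argmin(d_expl)                      # the "winner cell"
    # Each cell compares its ordering-space position to the winner's.
    d_order = np.linalg.norm(order_pos - order_pos[winner], axis=1)
    h = np.exp(-(d_order ** 2) / (2 * sigma ** 2))  # neighborhood weights
    # XOM update: shift outcome positions toward the stimulus.
    outcome_pos += eps * h[:, None] * (stimulus - outcome_pos)
    return outcome_pos

# Usage: 100 cells on a 1D "blueprint", embedded into 2D.
order_pos = np.linspace(0, 1, 100)[:, None]
outcome_pos = np.random.rand(100, 2)
for _ in range(1000):
    stimulus = np.random.rand(2)   # data object of the exploration space
    outcome_pos = xom_step(order_pos, outcome_pos, stimulus)
```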

It is important that the cells can determine their (relative) position in the exploration space, ordering space, and outcome space, depending on the topology-preserving mapping applied. This can be achieved, e.g., by an information exchange in local surroundings of the cells in the different spaces. The cells check the consistency of the positions they take in the different spaces and correct these accordingly, continuously or occasionally.

The position determination in the different spaces can be realized, for instance, as follows: All cells produce one or more "products", e.g. fields, chemical substances, or data objects of any kind. These spread over the respective space according to suitable calculation rules. Eventually they decay, depending on the spatial and/or temporal distance from their creation, or they change their properties. From the locally determined field intensities, concentrations of substances, or product characteristics, the individual cells can determine their position in the respective space. Here, an integration over the concentration or characteristics of the products, resolved by spatial direction, is conceivable, which can be performed in the individual cells or by information exchange in local cell communities. An elegant form of the position determination in the ordering space can also be realized by XOM itself: Here, the positions of the cells in the exploration space or outcome space are used as ordering space of a new topology-preserving mapping, and vice versa. The roles of exploration space, resp. outcome space, and ordering space are thus swapped, in the sense of item 1s. The training of the new topology-preserving mapping yields estimated values for the cell positions in the ordering space, which can be compared to the positions in the ordering space currently stored in the cells. Then, if necessary, these can be updated.

Depending on such position determinations and consistency tests, local or global corrections of the cell number and cell characteristics in the organism can be made. For this, if necessary, new cells can be created, or existing cells can be destroyed or modified. A global correction measure is, for example, the total number of cells in comparison to the total number of data objects in the ordering space.

It is essential that such correction-motivating consistency measures can be derived from criteria of the mapping quality of topology-preserving mappings, as described, e.g., in item 2. Here, measures for the assessment of the local or global topology preservation are specifically important. If, for instance, a position determination in the ordering space is carried out with XOM, as described above, then knowledge about the strength and/or kind of local topology violations can be used to perform local corrections with regard to cell number and cell characteristics. If, vice versa, the position of a cell in the exploration space or outcome space is determined by XOM, then such consistency measures can also provide a basis for appropriate corrections, e.g. in the sense of items 1m, 1n, or 1q.

A characteristic property of such XOM-based systems is that, given identical or similar cell equipment, each cell can, in principle, adopt and, in particular, also modify any position and any function in the organism. The general structure and function of the organism remain essentially unchanged. The individual cells in such methods and devices are thus reminiscent of the "pluripotent stem cells" known in biology. This flexibility of the cells can, however, be limited, which can be denominated "cell differentiation", following the analogy to biological systems.

The exploration space can best be described as "body", in the sense of a space occupied by the organism. If this body is variable, or subject to external influences, then the organism can stabilize itself or adapt to the new conditions of the habitat with the help of adaptive training of the topology-preserving mapping according to XOM.

If parts of such XOM-based systems are removed or destroyed, the systems can regenerate according to the processes described above. If such systems are divided into two or more parts, complete organisms can develop from the single parts. These systems thus have the ability of self-regeneration and self-reproduction, whereby self-regeneration can be seen as a precondition for self-reproduction by division of the organism. A new system, with all characteristics of the complete organism, including its form and shape, can develop from small parts of the organism as well as from few or single cells. Consequently, based on XOM, one can construct methods and devices with the ability of morphogenesis, or simulate self-organization processes in nature and technology. The protection claimed in this patent application refers to both individual systems and ensembles of such systems.

Obviously, numerous extensions of such systems can be thought of: Principles from the fields of biology, especially genetics, or evolutionary computing can be applied to "breed" methods and devices as described above, with specific characteristics, or to improve these, on the level of single cells as well as on the level of complete organisms. In this sense, also a double or manifold representation of the ordering space in each cell, as "blueprint" of the organism, could be thought of, in analogy to the diploid chromosome set in somatic cells of biological organisms, or the possibility of sexual reproduction of organisms or cells with appropriate inheritance schemes.

Furthermore, it should be emphasized that the information processing in the XOM systems described here can also be hierarchical. The training of the topology-preserving mapping, for instance, can go through different hierarchical levels, e.g. by choice of subsets of the data objects of the ordering space to be represented, or of sets of "prototypical data objects" obtained from the distribution of data objects in the ordering space by application of suitable calculation methods. This can also happen, where appropriate, in dependence on the current system status or organizational level, e.g. a suitably determined "life stage" of the organism. For this, for instance, a vector quantization of the ordering space can be performed, as sketched below. Specifically, it can be helpful to represent, in the individual cells, information about different ordering spaces, to be used for the XOM mapping depending on the system status. Thus, or by use of similar methods, it is possible to first develop the basic structure of an organism and then, at later stages, its fine structure.
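Such prototypical data objects can be obtained, for example, by a standard vector quantization of the ordering space. The following is a minimal k-means-style sketch in Python/NumPy; the function name, the number of prototypes, and the iteration count are illustrative assumptions.

```python
import numpy as np

def vector_quantize(data, n_prototypes=16, n_iter=50, rng=None):
    """Simple batch vector quantization (k-means style): returns
    prototypical data objects representing the ordering space."""
    rng = np.random.default_rng(rng)
    # Initialize prototypes on randomly chosen data objects.
    prototypes = data[rng.choice(len(data), n_prototypes, replace=False)]
    for _ in range(n_iter):
        # Assign each data object to its nearest prototype.
        d = np.linalg.norm(data[:, None, :] - prototypes[None, :, :], axis=2)
        labels = np.argmin(d, axis=1)
        # Move each prototype to the mean of its assigned data objects.
        for k in range(n_prototypes):
            if np.any(labels == k):
                prototypes[k] = data[labels == k].mean(axis=0)
    return prototypes

# Usage: reduce a 2D ordering space to 16 prototypical data objects.
prototypes = vector_quantize(np.random.rand(1000, 2))
```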

An essential criterion for a method or device in the sense of a XOM organism as described above is that information about the ordering space is assigned to the data objects of the ordering space which goes beyond the data objects themselves, i.e. which is originally not included in the object (e.g. information about the topology of the ordering space). This information serves as locally stored information on the structure of the whole system, in the sense of a complete or incomplete "blueprint". This blueprint can be used to create, remove, or modify cells, as defined above, or assigned data objects in the ordering space, outcome space, and/or exploration space.

(u) Hierarchical XOM: In XOM, the training of the topology-preserving mapping can go through different hierarchy levels, for instance by choice of subsets of the data objects of the ordering space represented by the topology-preserving mapping, or by sets of "prototypical data objects" obtained from the distribution of data objects in the ordering space by application of appropriate calculation methods. These can be created, for instance, by vector quantization of the ordering space.

(v) Dynamic XOM: As already mentioned, it is possible to modify the data objects or their distribution in the ordering space or in the exploration space during a training process, or over a series of training processes.

(w) Test Phase of an Already Trained XOM: Finally, after the training of a topology-preserving mapping has been completed in the sense of XOM, new data objects can be added to the ordering space, exploration space, or outcome space and processed by use of this topology-preserving mapping, without further training of this mapping. This can be done, for instance, for the purpose of interpolation, extrapolation, embedding, hard or fuzzy clustering, classification, supervised mapping by means of functions or relations, visualization, or sorting, or in the context of processes related to self-organization or morphogenesis. Here, also methods according to the technical standard can be employed. A minimal sketch of such a test phase is given below.
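The following Python sketch illustrates one simple variant of such a test phase, under the assumption that the trained mapping is represented by ordering-space positions `order_pos` and corresponding codebook vectors `codebook` (e.g. from the training-step sketch in item 1t): a new exploration-space object is mapped, without further training, to the ordering-space position of its best-matching codebook vector. All names are illustrative.

```python
import numpy as np

def test_phase(codebook, order_pos, new_object):
    """Map a new exploration-space data object to the ordering space
    via its best-matching codebook vector; no training takes place."""
    winner = np.argmin(np.linalg.norm(codebook - new_object, axis=1))
    return order_pos[winner]

# Usage with illustrative data: codebook in 2D, ordering space in 1D.
codebook = np.random.rand(100, 2)
order_pos = np.linspace(0, 1, 100)[:, None]
print(test_phase(codebook, order_pos, np.array([0.3, 0.7])))
```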

2. Quality Assessment: Here, the emphasis is set on methods and devices for the assessment of the local or global mapping quality of the topology-preserving mapping employed according to item 1. This quality can be examined, for example, by:

(a) Topology and Distribution Preservation: For this, the determination of local and global topology-preservation measures in accordance with the technical standard can be performed, e.g. the so-called topographic product according to [1], or comparable measures, as described for example in [24], chapter 10.3, and in the literature referred to in this publication. A quality assessment can also be performed by analysis of measures of distribution preservation, e.g. so-called "distortion exponents", which can describe the density of the codebook objects in the outcome space in relation to the density of the data objects in the exploration space or in the ordering space, e.g. in the sense of [8] or [35].

(b) Distortion Measures: The examination of the XOM mapping quality can be performed by determination of distortion measures such as, for example, the cost function of non-linear embedding methods, e.g. of Sammon's Mapping [40], of so-called "Minimal Wiring" cost functions [32], [11], or by comparative determination of the rank of nearest neighbors in the ordering space or outcome space, e.g. after the presentation of a data object in the exploration space, in the sense of [7].

(c) Distance Plot: The testing of the XOM mapping quality can be performed by creation and/or analysis of so-called "distance plots". Here, the distances between data objects in the outcome space (or exploration space) are graphically plotted against the distances of corresponding data objects in the ordering space, e.g. the pairwise distances of the codebook vectors in the feature space of a self-organizing map against the pairwise distances of the corresponding position vectors in the model cortex. However, the corresponding distances in the different spaces can also be comparatively analyzed without a graphical representation; in the following, for reasons of simplicity, this case will also be treated as a "distance plot". It is not necessary to use all calculable pairwise distances in each space. The analysis can be performed, for instance, by qualitative visual observation and interpretation of the distance plot, by calculation of measures for the "width" of the distance plot, of correlation measures between the distances in the different spaces, such as correlation coefficients or covariance, or by means of methods for the global (i.e. regarding all distance pairs) and local (i.e. regarding single distance pairs) dimension determination of the distance plot, e.g. in the sense of the Hausdorff dimension [18] or the Grassberger-Procaccia dimension [16]. Specifically, it should be stressed that such analyses can be performed selectively for different ranges of the distances in the exploration space, outcome space, and ordering space. In particular, a selective analysis of the distance plot allows the observation and quantitative evaluation of convolution phenomena of the topology-preserving mapping for large distances (as described in [36], chapter 14), as well as of local topology violations for short distances. The measures cited above, or similar ones, can also be used as instruments for the comparative determination of the dimensions of data distributions in the different spaces. A minimal distance-plot sketch follows below.
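A minimal sketch of a distance plot in Python, assuming arrays `order_pos` and `codebook` as in the earlier sketches; it plots the pairwise ordering-space distances against the corresponding outcome-space distances and reports their correlation coefficient as a simple global quality measure. All names and data are illustrative.

```python
import numpy as np
import matplotlib.pyplot as plt

def pairwise_distances(x):
    # Euclidean distances between all rows of x, upper triangle only.
    d = np.linalg.norm(x[:, None, :] - x[None, :, :], axis=2)
    return d[np.triu_indices(len(x), k=1)]

order_pos = np.linspace(0, 1, 100)[:, None]   # ordering space (1D)
codebook = np.random.rand(100, 2)             # outcome space (2D), illustrative

d_order = pairwise_distances(order_pos)
d_outcome = pairwise_distances(codebook)

# Correlation between corresponding distances as a global quality measure.
print("correlation:", np.corrcoef(d_order, d_outcome)[0, 1])

plt.scatter(d_order, d_outcome, s=2, alpha=0.3)
plt.xlabel("distance in ordering space")
plt.ylabel("distance in outcome space")
plt.title("distance plot")
plt.show()
```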

(d) Outcome Plot and Exploration Plot: A quality assessment for XOM can also be performed by creation and/or analysis of a plot of the codebook objects in the outcome space, or of the data objects in the exploration space corresponding to these codebook objects, specifically if outcome space and exploration space are identical. In particular, data objects and/or topological relations of the data objects of the ordering space corresponding to the codebook objects can be visualized and analyzed by means of connection lines or other graphical aids (lines or graphical objects of different thickness, color, shade, structure, or texture). Specifically protected are such representations in combination with the representation of data objects of the exploration space and/or outcome space, or of their topological relations. Here, also the representation of the local value of quality measures of the employed topology-preserving mapping, by means of any graphical tools, concerning distortion, topology or distribution preservation, as well as information obtained by supervised learning, from distance plots, and from quality assessments, deserves special emphasis. As the mentioned ways of representation are an essential aspect of the XOM-based exploratory analysis of the ordering space and its topology, they are to be specifically protected by this patent, in particular in cases where the ordering space is determined totally or partially by input data, or where the exploration space or the outcome space is determined totally or partially by structure hypotheses. The remarks in this section are generally valid for data objects and for data objects newly calculated from data objects or from space regions assigned to data objects.

(e) Ordering Plot: Vice versa, data objects of the exploration space and/or outcome space can be visualized in the ordering space. Specifically, data objects and/or topological relations of the data objects in the exploration space or outcome space can be visualized and analyzed by means of connection lines or other graphical aids (lines or graphical objects of different thickness, color, shade, structure, or texture). Such a representation is specifically protected when combined with the representation of data objects of the ordering space or of their topological relations. Here, also the representation of the local value of quality measures of the employed topology-preserving mapping, by means of any graphical tools, concerning distortion, topology or distribution preservation, as well as information obtained by supervised learning, from distance plots, and from quality assessments, deserves special emphasis. As the mentioned ways of representation are an essential aspect of the XOM-based exploratory analysis of the exploration space, of the outcome space, or of their characteristics, they should be specifically protected by this patent, in particular in cases where the exploration space or the outcome space is determined totally or partially by structure hypotheses, or where the ordering space is determined wholly or partially by input data. The remarks in this section are generally valid for data objects and for data objects newly calculated from data objects or from space regions assigned to data objects.

(f) Quality Assessment with Supervised Learning: The XOM mapping quality can also be determined by the utilization of so-called supervised learning methods for the mapping of the different data spaces involved in the topology-preserving mapping onto each other. Here, a learning method or a learning device is trained on pairs of data objects, a pair consisting of one or more source data objects as well as of one or more target data objects. The source data objects are taken from a source space, the target data objects from a target space. In a so-called test phase, after the training has been completed or is in an advanced stage, if new source data objects are entered without the corresponding target data objects, an estimation of the assigned target data objects can be obtained by taking into account the trained mapping. Typical supervised learning methods are, for example, various neural networks (e.g. Multilayer Perceptrons [38], Radial Basis Functions Networks [33], Support Vector Machines [6], as well as numerous variations of these methods), local models (e.g. [43], [29]), such as local average models or local linear models, as well as all approximation or interpolation methods described in the literature. Topology-preserving mappings can also be used for supervised learning, for instance by splitting of the exploration space into source and target spaces of self-organizing maps in accordance with the technical standard, or by splitting of the ordering space into source and target spaces for XOM (refer also to item 7). Starting from exploration space, ordering space, and outcome space, any of these three spaces can serve, in principle, as source space or target space. If source and target space differ, then, at first, six possibilities for the supervised training of mappings between the three data spaces exist. However, any concatenation of these mappings can be thought of, whereby the source space can also correspond to the target space. The mapping quality of XOM can then be determined in the test phase of the mapping that has been trained in a supervised manner, by determination of a suitably quantified mapping error, i.e. the difference between the actual and the expected value of the target data objects. Here, any distance measures can be used. A measure often used in vector spaces in which a norm can be defined is, for instance, the sum of the squared differences between actual and expected values. In this way, with the determination of the mapping error in learning methods and learning devices trained in a supervised manner, the XOM mapping quality can be locally or globally determined. A special case is the use of a concatenation of mappings trained in a supervised manner, whereby source and target spaces of the concatenation are identical. Here, the deviation between source data objects and their images after a forward and backward projection to and from a different data space can be observed and analyzed; a minimal sketch follows below.
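The following Python sketch illustrates the special case of a concatenated forward and backward projection, using k-nearest-neighbor regression as an arbitrary stand-in for the supervised learner; `order_pos` and `codebook` are assumed as in the earlier sketches, and all parameter values are illustrative.

```python
import numpy as np
from sklearn.neighbors import KNeighborsRegressor

order_pos = np.linspace(0, 1, 100)[:, None]   # ordering space (1D)
codebook = np.random.rand(100, 2)             # outcome space (2D), illustrative

# Forward mapping (ordering -> outcome) and backward mapping
# (outcome -> ordering), each trained on corresponding pairs.
forward = KNeighborsRegressor(n_neighbors=5).fit(order_pos, codebook)
backward = KNeighborsRegressor(n_neighbors=5).fit(codebook, order_pos)

# Concatenation with identical source and target space: project forward,
# then back, and quantify the deviation as a sum of squared differences.
reconstructed = backward.predict(forward.predict(order_pos))
error = np.sum((order_pos - reconstructed) ** 2)
print("forward-backward projection error:", error)
```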

(g) Quality Assessment by Use of Interpolation, Extrapolation, or Approximation, Forward and Backward Projection: Finally, the quality assessment in XOM can be performed by interpolation, extrapolation, or approximation of data objects of the ordering space in the exploration space or outcome space, or by interpolation, extrapolation, or approximation of data objects of the exploration space or outcome space in the ordering space, or by comparison of a data object of the ordering space or exploration space to its image after forward and backward projection, by use of suitable methods of interpolation, extrapolation, approximation, or supervised learning, according to the technical standard or to this patent application.

(h) Quality Assessment by Trajectories or "Blobs": It should be emphasized that, in the presented and in other methods for the assessment of the XOM mapping quality, data objects can also be presented sequentially, e.g. data objects in the exploration space, whereby data objects presented consecutively over time have small mutual distances. The data presentation then follows "steady" trajectories in the respective data spaces, whereby proximity in time also implies proximity in space. A data presentation in the form of "stimulation areas" moving through time and space, so-called "blobs", is also possible. Under suitable assumptions, e.g. steadiness assumptions, additional criteria for the XOM quality assessment can be developed, e.g. by consideration of the methods mentioned, which also take into account the time dynamics of the data presentation or are influenced by it.

3. Dimension Determination: With topology-preserving mappings, methods and devices for the local or global dimension determination of data distributions can be constructed, specifically also for data distributions with a fractal local or a fractal global dimension. The dimension determination is performed by mapping two data distributions onto each other by means of topology-preserving mappings, whereby one distribution defines and influences the ordering space, the other the exploration space. By analysis of the characteristics of the topology-preserving mapping, for example in the sense of the methods and devices in item 2, conclusions can be drawn about the dimensions of the distributions employed, e.g. in the sense of a dimension comparison. Such methods and devices are an independent aspect of the present invention and independent of the XOM definition. They can, however, also be interpreted with regard to the functional and structural definitions described in section 2.1, if one carefully applies the notion of the "input data" introduced there. Here, the following cases must be distinguished:

-   (a) The dimension of the data distribution of the ordering space is to be determined, the dimension of the data distribution of the exploration space is known: The known dimension of the data distribution of the exploration space and the data distribution of the ordering space serve as input data in the sense of "something given, with which something should be done". In this way, the XOM definition is applicable.
-   (b) The dimension of the data distribution of the exploration space is to be determined, the dimension of the data distribution of the ordering space is known: The data distribution of the exploration space and the known data distribution of the ordering space serve as input data. The latter is thus an input object, and the XOM definition is applicable.
-   (c) The dimensions of both data distributions are unknown, only a dimension comparison is to be made: Both data distributions are thus input data, specifically also the data distribution of the ordering space. The XOM definition is, consequently, applicable.

For the dimension determination by means of topology-preserving mappings described above, in principle, any data distributions can be used. The following distributions can be listed here as specific reference distributions with known or calculable dimension: (i) the fractals described and mentioned in [27]; (ii) attractors of differential equations and systems of differential equations, in particular "chaotic" and "strange" attractors, such as the Lorenz attractor, the Rössler attractor, the Ueda-Duffing attractor, the attractor of the Mackey-Glass differential equation (differential-delay equation), etc.; as well as (iii) attractors of iterative mappings, in particular "chaotic" and "strange" attractors, such as the Sinai map, circle map, sine map, shift map, tent map, logistic map, Hénon map, Chirikov map, etc. Regarding (ii) and (iii), all attractors described in the literature on chaotic systems and non-linear dynamics can be used for the dimension determination by means of topology-preserving mappings. A literature overview can be found, for instance, at "http://www-chaos.umd.edu/publications/references.html". Specifically regarding (i), it should be emphasized that, in many cases, a special procedure is needed to perform a dimension determination by means of topology-preserving mappings. In this sense, the Hausdorff dimensions given in [27] are often analytically calculated values referring to the ideal fractal objects. These, in general, comprise an infinite number of data points and, thus, cannot be simulated exactly in data processing devices. If one thus creates, for instance, self-similar point distributions by use of recursive mapping rules over several recursion steps, according to the calculation rules in [27], then the result is often a data distribution with a very large number of data points. By reducing the number of recursion steps, the number of resulting data points becomes smaller; the resulting distribution, however, has characteristics other than those of the ideal fractal. Specifically, its Hausdorff dimension can differ considerably from the dimension of the ideal fractal. This is often caused by the fact that, over several recursion steps, it is the self-similarity that substantially determines the fractal dimension. The trick is thus to first calculate the fractal over numerous recursion steps and then to make a random selection of the calculated data points. To experimentally determine the dimension of the resulting data distribution, and to detect a possible deviation from the theoretically predicted value, the determination of the correlation dimension according to Grassberger-Procaccia [16] is specifically suitable; a minimal sketch is given at the end of this section. Data distributions whose dimensions can be systematically "tuned" or adjusted, e.g. by modifying one or more parameters in a specific range, are specifically suitable for the dimension determination by means of topology-preserving mappings. As examples, the systems cited in [27] in this context can be mentioned, e.g. fractal carpets, sponges, foams, nets, grids, or Koch islands and Koch lakes, as well as the Mackey-Glass differential equation (differential-delay equation) [26], for which the attractor dimension depends on the time delay. Some special cases of dimension determination by means of topology-preserving mappings should still be mentioned here:

(a) Dimension Determination in the Ordering Space: Determination of the dimension of the data distribution in the ordering space of a topology-preserving mapping, based on the methods and devices listed in item 1, specifically by use of the methods and devices listed in item 2, specifically by repeated application of the methods and devices listed in item 1, with data distributions in the exploration space having known, possibly different, dimensions.

(b) Dimension Determination in the Exploration Space: Determination of the dimension of the data distribution in the exploration space of a topology-preserving mapping, based on the methods and devices listed in item 1, specifically by use of the methods and devices listed in item 2, specifically by repeated application of the methods and devices listed in item 1, with data distributions in the ordering space having known, possibly different, dimensions.

(c) Dimension Comparison: Dimension comparison, based on the methods and devices listed in item 1, specifically also by means of exchange of the data distributions of the ordering and exploration space, i.e. reciprocal embedding, whereby the dimension of one or both distributions can be unknown, specifically by use of the methods and devices listed in item 2, specifically also by ordering of more than two data distributions with regard to their dimension by pairwise dimension comparison.

It is important for the dimension determination with XOM that all XOM modifications and evaluation techniques in this patent application, specifically those in items 1 and 2, can be used.
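As referenced above, a minimal sketch of the Grassberger-Procaccia correlation dimension in Python/NumPy: the correlation integral C(r) is estimated as the fraction of point pairs closer than r, and the dimension as the slope of log C(r) over log r within a scaling range. The chosen radii, point counts, and names are illustrative assumptions.

```python
import numpy as np

def correlation_dimension(points, radii):
    """Grassberger-Procaccia estimate: slope of log C(r) vs. log r."""
    # Pairwise distances (upper triangle, i.e. each pair counted once).
    d = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=2)
    d = d[np.triu_indices(len(points), k=1)]
    # Correlation integral C(r): fraction of pairs with distance < r.
    c = np.array([np.mean(d < r) for r in radii])
    mask = c > 0
    slope, _ = np.polyfit(np.log(radii[mask]), np.log(c[mask]), 1)
    return slope

# Usage: points on a 2D plane embedded in 3D should yield a value near 2.
rng = np.random.default_rng(0)
pts = np.c_[rng.random(1000), rng.random(1000), np.zeros(1000)]
print(correlation_dimension(pts, radii=np.logspace(-2, -0.5, 20)))
```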

4. Embedding: With XOM, effective methods and devices for the non-linear embedding of data objects or their distributions in the ordering space into any data objects or their distributions in the exploration space can be realized, specifically in accordance with the description in item 1. Typically, the ordering space of a topology-preserving mapping is constructed here from the input data to be embedded. In the simplest case, the input data are used directly for this purpose. It is, however, possible to process the input data by some calculation rule before entering them into the ordering space. It is also possible that the ordering space is not completely defined by input data. The data finally included in the ordering space are called original data. An arbitrarily chosen exploration space serves as embedding space, which is typically defined by structure hypotheses, but can also include, or be influenced by, input data. Embeddings with XOM can be useful for exploratory data analysis and/or visualization of the original data, or for data reduction of the original data, in particular if the exploration space is chosen simpler and/or of lower dimension than the data space of the original data. It can, however, be helpful to choose the dimension of the exploration space higher than that of the ordering space, e.g. to perform dimension estimates or to observe convolution phenomena. For this, refer also to items 3 and 2. Essential aspects of embedding by XOM are:

-   (a) that, specifically, data objects, distributions of data objects, or data spaces can also be embedded for which any distance measures apply, i.e., in general, the distances between the data objects of these distributions can be defined by any distance measure, including those that form no metric in a mathematical sense. Here, refer particularly to items 1e, 1f, and 1g. The embedding of any, in particular also non-metric, data distributions, which define the topology of the ordering space of topology-preserving mappings with regard to any distance measures, is a central aspect of the invention. A very important special case is that of pairwise, possibly non-metric, dissimilarities of data objects.
-   (b) that, specifically, data objects, distributions of data objects, or data spaces with fractal local or global dimension can also be embedded; refer to the explanations in item 3.
-   (c) that, specifically, data objects, distributions of data objects, or data spaces can also be embedded which are completely or partially defined by distance objects, i.e. for which, for example, only pairwise distances, but no metric embeddings, are known, specifically for the calculation of metric embeddings.
-   (d) that, specifically, rescalings of distances in the ordering space in the sense of item 1r, e.g. as Sparseness Annealing, as well as all XOM modifications described in this patent application and, in particular, in item 1, can be employed.
-   (e) that the result of the embedding can be visualized and analyzed in a graphical display in the form of an outcome or exploration plot, according to item 2d, specifically also for the purpose of non-linear principal component analysis, for the visualization of solutions of optimization problems, or for the visualization of data partitioning (clustering results).
-   (f) that the result of the embedding can be visualized and analyzed in a graphical display in the form of an ordering plot, according to item 2e, specifically also for non-linear principal component analysis, for the visualization of solutions of optimization problems, or for the visualization of data partitioning (clustering results).
-   (g) that a quality assessment of the embedding results can be performed according to item 2. This can be used to improve structure hypotheses regarding the choice of suitable exploration spaces in a goal-directed, possibly iterative way.

5. Data Partitioning, Clustering: With XOM, efficient methods and devices for hard or fuzzy partitioning or clustering of distributions of data objects can be constructed, specifically by taking into consideration the descriptions in item 1. In the following, an example of a typical procedure is illustrated:

-   (a) Specify the input data distribution as ordering space.
-   (b) Specify a suitable structure hypothesis for the exploration space. As an arbitrary example, a data distribution is mentioned here that is composed of several Gaussian distributions, the parameters of which are determined ad hoc. The centers of the Gaussian distributions can have any specified topological structure, for instance be ordered on a regular grid. Note that there are no restrictions on the choice of the data distributions in the exploration space; see, in particular, the items listed below.
-   (c) Train the topology-preserving mapping.
-   (d) Assign the codebook objects, in a hard or fuzzy way, to the individual data distributions of the exploration space, for instance by calculation and comparison of distances of each codebook object to the centers of the data distributions, which, in this case, have been specified as Gaussian distributions. By definition of suitable distance measures, e.g. likelihood, this can also be performed in a fuzzy way, as well as in cases in which the outcome space and the exploration space are not identical.

Essential aspects of data partitioning or clustering by XOM are that

-   (a) specifically, also data objects, distributions of data objects, and data spaces can be clustered for which any distance measures apply, i.e. the distances between the data objects of these distributions can, in general, be defined by any distance measures, including measures that do not form a metric in a mathematical sense. Here, refer specifically to items 1e, 1f, and 1g. The clustering of any, specifically also non-metric, data distributions, whereby these data distributions determine the topology of the ordering space of topology-preserving mappings in connection with any distance measures, is a central aspect of the invention. A very important special case is that of pairwise, possibly non-metric, dissimilarities of data objects.
-   (b) specifically, data objects, distributions of data objects, and data spaces with a fractal local or global dimension can be clustered; here, refer also to the remarks in item 3.
-   (c) specifically, also data objects, distributions of data objects, and data spaces can be clustered which are defined totally or partially by distance objects, i.e. for which, for example, only pairwise distances but no metric embedding are known, in particular for the purpose of clustering in metric embeddings.
-   (d) there are no restrictions on the choice of the data objects and distributions, as well as their parameters, in the exploration space.
-   (e) this can be performed, specifically, by training of the topology-preserving mapping with a natural number of identical, similar, or different data distributions in the feature space, with different centers or medians.
-   (f) this can be performed, specifically, by hard or fuzzy assignment of single data objects to clusters, by use of a criterion which refers to the distance (e.g. minimal distance), in any distance measure, of the codebook object associated with the data object from the centers or other characteristic points (e.g. medians) of the data distributions in the exploration space, e.g. the likelihood (e.g. maximum likelihood) of the positioning of the codebook object given a known structure of the data distribution in the exploration space, or any other calculation rule based on the full or partial knowledge of the distribution functions in the exploration space.
-   (g) the choice of the data objects and distributions in the exploration space can comprise, specifically: simple geometric objects (e.g. polygons, simple geometric bodies, line sections, circles, rings, spheres, etc.), any characteristic distributions, e.g. localized uniform distributions, normal distributions, Laplace distributions, Poisson distributions, binomial distributions, hyper-geometric distributions, χ² distributions, Student's t-distributions, Fisher F-distributions, Gamma distributions, Fisher Z-distributions, Kolmogorov-Smirnov λ-distributions (for definitions refer to [5]), or single data objects in the sense of delta peaks.
-   (h) the centers or other local parameters of the data distributions in the exploration space, e.g. the medians, can be ordered in a pairwise equidistant manner, e.g. on a discrete periodical grid.
-   (i) specifically, a proper or improper subset of the weights, e.g. the number of data objects in each distribution, or the scattering measures (momenta) or other parameters of the data distributions in the exploration space, can be identical or similar.
-   (j) specifically, for a specification of n ∈ ℕ distributions, the centers of the data distributions in the exploration space can lie at the corners of a regular simplex in the exploration space, the dimension of which is at least n−1.
-   (k) specifically, number, structure, localization, dimension, relative or absolute weights, or any other parameters of the data distributions in the exploration space can be variable over a training process or repeated training processes of the topology-preserving mapping; in particular, these can be varied in order to optimize a quality criterion according to items 2 or 6. Specifically, the scattering measures of the distributions can be chosen systematically variable over a training process or successive training processes, e.g. to facilitate an increasing focusing of the data objects onto single clusters, i.e. to reduce the entropy of the distribution of the data objects over the clusters.
-   (l) specifically, also rescalings of the distances in the ordering space in the sense of item 1r, e.g. as Sparseness Annealing, as well as all XOM modifications listed in this patent application, in particular in item 1, can be employed.
-   (m) the clustering results can be visualized and analyzed by graphical representation in the form of an outcome or exploration plot as described in item 2d. Here, in particular, such representations are protected which characterize cluster boundaries or cluster tessellations, or which mark the affiliation of data objects to clusters by means of any graphical aids.
-   (n) the clustering results can be visualized and analyzed by graphical representation in the form of an ordering plot according to item 2e. Here, in particular, such representations are protected which characterize cluster boundaries or cluster tessellations, or which mark the affiliation of data objects to clusters by means of any graphical aids.
-   (o) a quality assessment of the clustering results can be made according to item 2. Thereby, specifically, structure hypotheses with regard to suitable exploration spaces can be improved in a goal-directed and, possibly, iterative way.
-   (p) the clustering can be performed hierarchically, specifically with regard to item 1q, e.g. by dynamically and successively splitting distributions in the exploration space during a training process or over a series of training processes of the topology-preserving mapping.

6. Cluster Validity: The term cluster validity describes the problem of defining appropriate structure hypotheses for the data distributions to be clustered and/or of evaluating the quality of given partitionings of data with regard to these structure hypotheses, specifically regarding the number and/or relative weight of the clusters, the selection of initialization strategies, and/or the selection of the employed clustering method. For the problem of cluster validity, as well as numerous proposals for its solution, refer, for example, to [31].

An essential independent aspect of the invention is that, in contrast to the technical standard, methods for the determination of the cluster validity on dissimilarity data are proposed. Such a method can be technically described as follows:

Data processing method for the determination of the cluster validity, in which data objects are entered, distance objects are entered and/or calculated, and an assignment of the data objects to be processed to groups is entered and/or calculated, in particular according to methods described in this patent application, where a measure of the quality of this assignment is delivered as output, whereby the measure of the quality of the assignment is calculated using at least a part of the entered and/or calculated distance objects. For the term "distance object", the definition above applies. It should be stressed once more that, in particular, such distance measures are included in this definition that do not define a metric in a mathematical sense.

As a concrete realization of such methods, two procedures are proposed:

First, cluster validity measures can be developed for dissimilarity data that are based on cost functions employed in methods for the clustering of dissimilarity data. For examples of such cost functions, refer to the literature on methods for the clustering of dissimilarity data, particularly [21], [13], [14], [15], as well as to the literature cited in these publications.

Methods and devices for the determination of cluster validity can be developed, for example, by calculating second differences of the cost functions used for the clustering of dissimilarity data, such as second differences of the cost functions depending on the currently used number of clusters. Relative or absolute maximum values of the magnitude of these second differences can be used as a cluster validity criterion.
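A minimal sketch of this criterion in Python/NumPy, assuming a sequence of clustering cost values `costs` obtained for increasing cluster numbers `ks`; how the costs are computed is left to the chosen dissimilarity clustering method, and the example cost curve is illustrative.

```python
import numpy as np

def validity_by_second_differences(ks, costs):
    """Second differences of the clustering cost over the number of
    clusters; a pronounced maximum of their magnitude suggests a
    plausible cluster number (the "knee" of the cost curve)."""
    costs = np.asarray(costs, dtype=float)
    second_diff = costs[:-2] - 2 * costs[1:-1] + costs[2:]
    best = np.argmax(np.abs(second_diff)) + 1  # offset: diffs start at ks[1]
    return ks[best], second_diff

# Usage with an illustrative cost curve that flattens after k = 3.
ks = np.arange(1, 9)
costs = [100, 55, 20, 17, 15, 14, 13.5, 13]
print(validity_by_second_differences(ks, costs))  # suggests k = 3
```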

Second, efficient methods and devices for the assessment of cluster validity can be constructed by XOM, with respect to the hard or fuzzy partitioning or clustering of distributions of data objects, specifically in accordance with the descriptions in items 1, 5, and 2.

An example of the typical procedure for the determination of cluster validity by XOM is described in the following:

-   (a) Define a cluster validity criterion, e.g. according to item 2.
-   (b) Perform a clustering according to item 5.
-   (c) Analyze the clustering results with respect to this criterion.
-   (d) Modify the structure hypotheses for the clustering, i.e. the data distributions in the exploration space chosen for the clustering. Repeat the clustering and the analysis regarding the criterion, possibly several times, e.g. aiming at an optimization of the clustering results with regard to the criterion.

Simple and important examples of cluster validity criteria are the measures for the analysis of the distortion and of the topology and distribution preservation described in item 2, as well as measures obtained from distance plots or from quality assessments by supervised learning.

Essential aspects of the cluster validity analysis by XOM are

-   (a) that it can, specifically, be performed for non-metric data distributions as well. It can be performed for any data objects, distributions of data objects, or data spaces, specifically for those that can be clustered by XOM. The remarks in item 5 concerning this are fully applicable.
-   (b) that it can be performed, in particular, based on all methods and devices in item 2.
-   (c) that it can also be used to evaluate the quality of a given data partitioning, i.e. of one that has not been obtained by XOM clustering.
-   (d) that a visualization of such analyses can be performed by means of exploration, outcome, and ordering plots in the sense of item 2. Here, in particular, also a visualization of a known or calculated data partitioning is possible, for instance by visualization of the assignment of data objects to clusters. Additionally, a graphical representation of the cluster validity measures depending on the structure hypotheses or on their parameters is possible. Typically, cluster validity measures can be represented depending, for instance, on the number of given clusters.
-   (e) that, in case of repetitive application of such analyses, not only the number of clusters, but any structure hypotheses can be modified; in particular, schemes of hierarchical clustering, refer to item 5, can be applied.

7. Supervised Learning: By XOM, methods and devices for supervised learning can be constructed, specifically for the approximation or interpolation of functions, for time series analysis or time series prediction, or for smoothing or filtering. In supervised learning, a learning method or a learning device is trained by use of pairs of data objects. A pair includes one or more source data objects as well as one or more target data objects. The source data objects are taken here from a source space, the target data objects from a target space. In a so-called test or working phase, after the training has been completed or is in an advanced stage, if new source data objects are entered without the corresponding target data objects, an estimation of the assigned target data objects can be obtained by using the trained mapping. Typical supervised learning methods are, for example, different neural networks (e.g. Multilayer Perceptrons [38], Radial Basis Functions Networks [33], Support Vector Machines [6], as well as numerous variations of these methods), local models (e.g. [43], [29]), such as local average models or local linear models, as well as all approximation or interpolation methods described in the literature.

Supervised learning by XOM can be implemented by use of all aspects described in this patent application, e.g. using the description in item 1, as well as in combination with the use of interpolation or approximation methods according to the technical standard. A minimal sketch of the product-space approach described in item (a) of the list below is given here.
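The following Python/NumPy sketch illustrates supervised learning by a topology-preserving mapping over the product space of source and target space: codebook vectors are trained on joint (source, target) pairs, and in the working phase the target coordinates are completed from the best match in the source coordinates. The SOM-style training loop and all names and parameter values are illustrative stand-ins, not a prescribed implementation.

```python
import numpy as np

def train_product_space(pairs, n_nodes=50, n_iter=3000, eps=0.1, sigma=3.0):
    """Train codebook vectors in the product space of source and target
    coordinates (here: 1D source, 1D target), SOM-style on a 1D grid."""
    grid = np.arange(n_nodes)
    codebook = pairs[np.random.choice(len(pairs), n_nodes)].astype(float)
    for t in range(n_iter):
        x = pairs[np.random.randint(len(pairs))]
        winner = np.argmin(np.linalg.norm(codebook - x, axis=1))
        h = np.exp(-((grid - winner) ** 2) / (2 * sigma ** 2))
        codebook += eps * h[:, None] * (x - codebook)
    return codebook

def predict(codebook, source_value, n_source_dims=1):
    """Working phase: complete the target coordinates of the codebook
    vector that best matches the given source coordinates."""
    winner = np.argmin(np.abs(codebook[:, :n_source_dims] - source_value))
    return codebook[winner, n_source_dims:]

# Usage: approximate y = sin(x) from sampled (x, y) pairs.
x = np.random.uniform(0, 2 * np.pi, 500)
pairs = np.c_[x, np.sin(x)]
codebook = train_product_space(pairs)
print(predict(codebook, 1.0))  # should be close to sin(1.0), i.e. about 0.84
```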

Realization possibilities and essential aspects of supervised learning by XOM are

-   (a) that, with XOM, this can be performed, in particular, by splitting of the ordering space into source and target space. Typically, the ordering space is defined here as the product space of the source and target space. Then, a representative hyper-manifold of the data distribution is constructed within this product space, according to 14, using XOM. In the working phase, if the hyper-manifold is known, it is possible to determine a target data object from a given source data object by completing the coordinates of the point of the hyper-manifold corresponding to the source data object in the target space. This method can be used, for example, for function approximation or function interpolation.
-   (b) that this can, in particular, be implemented by means of methods and devices according to item 1o, for example for the approximation or interpolation of functions.
-   (c) that this can be implemented, in particular, by use of methods and devices for XOM clustering according to item 5. Typically, the XOM clustering results are used here as an additional input for methods and devices for supervised learning according to the technical standard. A very important special case is the use of XOM clustering results as input for the training of radial basis functions networks, according to item 21a.
-   (d) that this can be used, in particular, for supervised learning on metric or non-metric dissimilarity data, for instance for the purpose of classification of such data. Here, for example, XOM clustering according to item 5 can be performed on, possibly non-metric, dissimilarity data. The clustering results could then be entered, for example, into the training of a radial basis functions network, e.g. in the sense of item 21a.
-   (e) that this can be done particularly in combination with the use of interpolation and approximation methods according to the technical standard or to the other claims.

8. Registration: By XOM, it is possible to realize methods and devices for the registration of data sets, considering all items of this patent application, particularly item 1, as well as combinations of XOM with methods and devices according to the technical standard. Specifically, a non-linear, non-affine, locally distorting registration of data sets can be realized.

The simplest case is typically based on two data distributions. The so-called "test data set" is to be registered onto a "reference data set", which is often similar to the test data set according to criteria to be suitably defined. Typically, test and reference data sets are given; both are thus input data in the sense of "something given, with which something should be done". In the simplest case, one of the data sets is used to define the ordering space of a topology-preserving mapping, while the other one is used to define its exploration space. In any case, input data are used to partially or completely define the ordering space. Therefore, the XOM definition is applicable.

After completed training of the topology-preserving mapping, the quality of the registration result can be evaluated, specifically by means of the methods and devices in item 2.

Essential aspects of the registration by XOM are

-   (a) that it can be employed specifically for the registration of, possibly multispectral, image data sets in 2D and 3D, as well as of image series.
-   (b) that it can be used specifically for the registration of time series or time functions, for example in the sense of Dynamic Time Warping (DTW). For the definition of DTW, refer, e.g., to [22].
-   (c) that it can be used specifically as pre-processing for any further data processing tasks, e.g. classification or clustering, in the sense of a "normalization". Here, different data sets, e.g. image data sets, are registered to a given standard data set. If, for example, a classification problem, or any other problem, has already been fully or partially solved on the standard data set, this solution can be adopted for the other data sets after the registration. An arbitrary example of this is the segmentation of certain regions in image data sets of the brain by registration of image data sets from different individuals to a previously segmented "standard brain" used as a standard data set.
-   (d) that, by this, specifically measures for the local or global similarity between different data sets can be obtained, particularly by use of the methods and devices according to item 2.
-   (e) that, before registration, a data reduction in the sense of a vector quantization can also be performed.
-   (f) that boundary conditions or other additional constraints for the registration can be enforced by so-called "topology anchors". These are additional data objects added to the data sets to be registered. This is, in general, done (i) in the regions of the data sets which should be adjusted as well as possible by the registration, and (ii) similarly in all data sets to be registered. These topology anchors are usually chosen in such a way that, in case of their incongruent registration, one would expect high costs in the sense of mapping quality measures, e.g. according to the criteria mentioned in item 2.

9. Active Learning: By XOM, it is possible to realize methods and devices for so-called "active learning", by reference to all items of this patent application, specifically item 1, as well as in combination of XOM with methods and devices according to the technical standard. By this, a procedure is understood in which, during the training process of a learning procedure, the selection of data objects out of the training data set for the further training is directedly influenced by the current status of the learning procedure, by use of suitable calculation methods.

A typical example of the realization of active learning by XOM is a situation where the selection of data objects out of the exploration space during the training process of the topology-preserving mapping is influenced by the current status of the topology-preserving mapping, by use of suitable calculation methods, e.g. by the achieved global or local mapping quality, e.g. calculated by using the methods and devices as described in item 2.
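As an illustration, the following minimal sketch shows one way such a selection step could be realized, assuming vectorial exploration data and codebook objects: samples are drawn with probability proportional to their current local mapping error, so that poorly mapped regions are presented more often during further training. The function name and the error measure are illustrative assumptions, not the specific calculation methods of this application.

```python
import numpy as np

def pick_training_sample(exploration_data, codebook, rng):
    """Draw one exploration-space sample, favoring badly mapped regions."""
    # local mapping error: distance of each exploration point to its
    # best-matching codebook object
    d = np.linalg.norm(exploration_data[:, None, :] - codebook[None, :, :], axis=2)
    local_error = d.min(axis=1)
    probs = local_error / local_error.sum()  # larger error -> larger probability
    return exploration_data[rng.choice(len(exploration_data), p=probs)]
```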

10. Molecular Dynamics Simulation: By XOM it is possible to realize methods and devices for the so-called "molecular dynamics simulation", by use of all items of this patent application, specifically item 1, as well as in combination of XOM with methods and devices according to the technical standard. By this, calculations of the spatial and temporal structure of molecules of fully or partially known composition, as well as the use of knowledge gained from these calculations, are understood. Important examples are the analysis of the secondary or tertiary structure of proteins or the analysis of the functional spatio-temporal structure of active centers of enzymes. An essential invention in this context is that, for the molecular dynamics simulation, "rigid" spatial relations or constraints between the atoms of a molecule or its surroundings, i.e. those that can only be changed by a relevant amount through strong external influences, are used to define the topology of the ordering space of a topology-preserving mapping. Typical examples of such rigid spatial relations are bond lengths and bond angles in covalent bonds between atoms of a molecule. In the simplest case, each atom or group of atoms is assigned to a data object of the ordering space as well as to a codebook object.

By training of the topology-preserving mapping with XOM, interactions between atoms or with the surroundings can be modeled, whereby the analysis of the outcome space yields the sought structure of the molecule. Examples of procedures for such modeling are:

-   (a) Modeling of the interaction by codebook-specific variation of the learning rule of the topology-preserving mapping, for instance in the sense of item 1m. A simple example could be the modeling of the learning parameter ε in a self-organizing map according to equation (9), in dependence on the strength of the interaction. In analogy, a modeling is conceivable where the interaction between two atoms is not considered at every learning step but less often, depending on the strength of the interaction. In this way, different degrees of "rigidity" of the spatial constraints mentioned above can also be modeled.
-   (b) Iterative use of XOM, e.g. according to item 1s. This can be combined, in particular, with a procedure where the XOM molecular dynamics simulation is divided into small simulation steps, whereby in each simulation step only small changes of the spatio-temporal molecular structure are modeled. At the end of a simulation step, the outcome space is used as the new ordering space of the topology-preserving mapping and the simulation is continued. At this point, those original constraints of the topology of the ordering space can be restored that, owing to topology violations in the outcome space during the previous simulation step, were no longer adequately represented. Topology violations regarding the constraints can thus be corrected. At the same time, new topological relations between the atoms, derived from the result of the previous simulation step, can be taken into consideration in the modeling of the new ordering space. Specifically, procedures are conceivable in which a continuous correction of local topology violations is performed, e.g. with regard to the criteria mentioned in item 2.

11. Robotics: In analogy to item 10, problem solutions can be achieved in robotics, in particular in the field of inverse kinematics.

In analogy to the procedure in the molecular dynamics simulation, "rigid" spatial relations or constraints between the components of a robot or between the robot and its surroundings, i.e. those that can only be changed by a relevant amount through strong external influences, are used to define the topology of the ordering space of a topology-preserving mapping. Typical examples of such rigid spatial relations are the form and size of components of a robot or constraints regarding the relative mobility of its components against each other. In the simplest case, a data object of the ordering space as well as a codebook object is assigned to characteristic points of components or of a localized group of components. All remarks in item 10 are then applicable in a completely analogous way.

12. Sorting: With XOM it is possible to realize methods and devices for the sorting of data objects, e.g. as described in item 1. Here, the intended ordering of the data objects is represented by the topology of the ordering space. This can be performed, in particular, in situations where only a proper subset of the possible pairwise ordering relations between the data objects is known or calculable, or should be used for the sorting.

13. Optimization: By XOM it is possible to realize methods and devices for finding solutions to optimization problems, by use of all items of this patent application, specifically item 1, as well as in combination with methods according to the technical standard.

Important aspects regarding the use of XOM for finding solutions to optimization problems are that:

-   (a) this is, in particular, possible even if only a proper subset of the calculable pairwise distances between the data objects is used as input data.
-   (b) this is, in particular, possible even if the pairwise distances between data objects do not form a metric.
-   (c) this can, specifically, also be used for finding solutions of NP-hard optimization problems, e.g. of metric or, particularly, non-metric Traveling Salesman Problems or similar mathematical problems. In the Traveling Salesman Problem, for instance, the positions of the cities can determine the topology of the ordering space, while a ring-shaped uniform distribution can represent the exploration space. The visualization of the solution can be given as an exploration plot as well as, specifically, as an ordering plot according to item 2. A minimal sketch of this construction follows this list.
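As an illustration of item (c), the following sketch shows what such a construction could look like for Euclidean city coordinates: one neuron per city, the pairwise city distances defining the neighborhood of the ordering space, a uniform distribution on a unit ring as exploration space, and the tour read off from the angular order of the trained codebook vectors. All parameter schedules are illustrative assumptions, not prescriptions of this application.

```python
import numpy as np

def xom_tsp(cities, n_iter=20000, rng=None):
    """Heuristic tour through the given (n, 2) city coordinates."""
    rng = rng or np.random.default_rng(0)
    cities = np.asarray(cities, dtype=float)
    n = len(cities)
    # pairwise city distances define the ordering-space neighborhood
    D = np.linalg.norm(cities[:, None] - cities[None, :], axis=2)
    W = rng.normal(scale=0.1, size=(n, 2))       # codebook objects in outcome space
    for i in range(n_iter):
        frac = i / n_iter
        eps = 0.5 * 0.02 ** frac                 # learning rate, 0.5 -> 0.01
        sigma = D.max() * 0.01 ** frac           # decaying neighborhood width
        phi = rng.uniform(0.0, 2.0 * np.pi)
        w = np.array([np.cos(phi), np.sin(phi)]) # sample from the ring distribution
        winner = int(np.argmin(np.linalg.norm(W - w, axis=1)))
        h = np.exp(-((D[winner] / sigma) ** 2))  # neighborhood in the ordering space
        W += eps * h[:, None] * (w - W)
    return np.argsort(np.arctan2(W[:, 1], W[:, 0]))  # tour = angular order on ring
```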

14. Construction of Hyper-Manifolds: With XOM, methods and devices for the construction of approximating hyper-manifolds and for non-linear principal component analysis can be implemented, applying all items of this patent application, specifically item 1, as well as in combination of XOM with methods and devices according to the technical standard.

Important aspects regarding the use of XOM for the construction of approximating hyper-manifolds and for the non-linear principal component analysis are that:

-   (a) this can be done specifically by the calculation of supporting points of the hyper-manifolds by using local, possibly weighted, averaging, interpolation, or approximation in the ordering space or outcome space after completed XOM embedding; refer also to item 4. The XOM embedding for the non-linear principal component analysis is made possible, for instance, by the calculation of a path through the data objects of the ordering space based on an embedding in a 1D-manifold in the exploration space. A minimal sketch of this local averaging follows this list.
-   (b) here, specifically, also the size or structure of the local areas chosen in this context can be variable, e.g. by use of methods and devices according to item 2, for example in order to allow a local adjustment of the representation quality of the hyper-manifold.
-   (c) specifically, the dimension or structure of the training data set in the exploration space can also be locally or globally variable during a training process, or over a series of training processes. It can, for instance, be dynamically adjusted by use of criteria for the determination of the global or local topology preservation or dimension estimation, according to items 2 or 3.
-   (d) specifically, the hypothetically assumed dimension or structure of the representing hyper-manifolds in the ordering space, or the ordering space itself, can be locally or globally variable during a training process or over a series of training processes. It can, for instance, be dynamically adjusted by use of criteria for the determination of the global or local topology preservation or dimension estimation, for instance according to items 2 or 3.
-   (e) specifically, also in the sense of evolutionary computing algorithms, structure hypotheses about data distributions in the exploration space or representing hyper-manifolds in the ordering space can be created, dynamically modified and/or optimized, specifically by methods and devices according to 14c or 14d, whereby single structure hypotheses can also be seen as individuals. Here, specifically, also mutations can be influenced by use of criteria for the determination of the global or local topology preservation or dimension estimation, for instance according to items 2 (here, in particular, also item 2h) or 3.
-   (f) the visualization of the generated hyper-manifolds can be performed directly in the ordering space or indirectly by their embedding in the exploration space or outcome space. The visualization is thus possible by means of exploration, outcome, and ordering plots in the sense of item 2. Hereby, specifically, also the visualization of the local mapping quality on these hyper-manifolds or their embeddings, according to item 2, can be performed by color or other optical coding.
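As a sketch of the locally weighted averaging in item (a), assuming the XOM embedding has already been computed and has produced a scalar coordinate t for each data point: supporting points of an approximating 1D hyper-manifold are obtained as kernel-weighted averages along the embedding coordinate. Kernel, width, and grid size are illustrative assumptions.

```python
import numpy as np

def supporting_points(X, t, n_grid=50, h=0.1):
    """Supporting points of a 1-D manifold through X (n, d), given coordinates t (n,)."""
    X, t = np.asarray(X, float), np.asarray(t, float)
    grid = np.linspace(t.min(), t.max(), n_grid)
    # Gaussian weight of every data point for every grid position
    w = np.exp(-0.5 * ((grid[:, None] - t[None, :]) / h) ** 2)
    w /= w.sum(axis=1, keepdims=True)
    return w @ X  # one locally averaged supporting point per grid node
```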

15. Interpolation, Extrapolation, Approximation: By XOM it is possible to implement methods and devices for the interpolation, extrapolation, or approximation of data distributions, by use of all items of this patent application, specifically of item 1, as well as in combination of XOM with methods and devices according to the technical standard.

Important aspects regarding interpolation, extrapolation, or approximation with XOM are that:

-   (a) this can be performed, specifically, by use of methods and devices according to item 11.
-   (b) this is, specifically, possible even if the interpolation, extrapolation, or approximation is to be performed for additionally entered data objects, after partial or complete training of the topology-preserving mapping.
-   (c) this can be performed, specifically, by use of methods and devices according to item 14.
-   (d) specifically, the data distributions in the exploration, outcome, or ordering space of topology-preserving mappings trained in the sense of XOM can be processed by use of methods and devices according to the technical standard, e.g. neural networks, local models, or any other methods for the interpolation, extrapolation, or approximation.

16. Self-Organization: By XOM, methods and devices for the self-organization and morphogenesis of objects, structures, and systems of any kind, specifically technical systems, can be realized which possess abilities like self-regeneration, self-reproduction, or decentralized information storage. This can be performed by use of all items of this patent application, specifically item 1, here in particular 1t, as well as in combination of XOM with methods and devices according to the technical standard.

17. Relevance Learning: With XOM, methods and devices for the determination of the relevance of data objects or components of data objects for tasks of data processing and data analysis can be realized by determination of task-specific target criteria, by use of all items of this patent application, specifically item 1, as well as in combination of XOM with methods and devices according to the technical standard. This can be performed, for instance, by scaling of the single dimensions when using vectorial input data, or by selection of specific data objects from the training data set for the training of the topology-preserving mapping, e.g. for supervised learning tasks, clustering, or the construction of representing hyper-manifolds.

18. Visualization and Layout of Graphs: By XOM, methods and devices for the visualization and layout of graphs can be implemented, by use of all items of this patent application, specifically item 1, as well as in combination of XOM with methods and devices according to the technical standard.

XOM can be used for the layout and visualization of graphs, specifically,

-   (a) if more than one data distribution in the exploration space is used for the training.
-   (b) if the data distribution in the exploration space which is used for the training is not uniform.
-   (c) if the data objects or subsets thereof in the ordering space do not satisfy any metric in a mathematical sense.
-   (d) if the data distributions in the exploration space used for the training are not convex.
-   (e) if the data objects or subsets thereof in the ordering space or in the exploration space do not satisfy Euclidean geometry, or if their distance is defined by arbitrary dissimilarity measures.
-   (f) if distances of any data objects are used for the training, even if these are not connected by an edge, e.g. by use of geodesic distances or a rank metric.
-   (g) if the topology-preserving mapping does not correspond to the sequential formulation of a self-organizing map according to Kohonen.
-   (h) if the distribution of the training data used for the training of the topology-preserving mapping in the exploration space has a dimension other than 2 or 3.
-   (i) if the distribution used for the training of the topology-preserving mapping is not a sphere in 3D.
-   (j) if the training rule of the topology-preserving mapping for the codebook objects assigned to the nodes can be different for different nodes or codebook objects. For this, also refer to item 1m.
-   (k) if not all connections for which the mutual distances are known or have been calculated are used for the visualization of the graph.

19. Applications: By XOM, methods and devices for applications in the fields of circuit design, bio-informatics, robotics, meteorology, image processing, technical self-organizing and self-repairing systems, text mining, flight security, traffic control and maintenance systems, coding, encrypting, and security technology can be constructed. This can be performed by use of all items of this patent application, in particular item 1, here specifically 1t, as well as in combination of XOM with methods and devices according to the technical standard.

20. Combinations: The methods and devices listed in the single items above can be combined in numerous ways. In this context, the following should be specifically emphasized:

-   (a) Combination of dimension determination and embedding
-   (b) Combination of embedding and determination of approximating hyper-manifolds
-   (c) Combination of clustering and cluster validity analysis
-   (d) Combination of embedding and clustering. Here, the embedding can be used for dimension or data reduction.

21. Combination with Methods and Devices According to the Technical Standard: The methods and devices listed in the single items above can also be used in combination with methods and devices according to the technical standard. The following should be specifically emphasized:

-   (a) Combination of XOM clustering with methods and devices for supervised learning, specifically for the creation of networks in the sense of radial basis functions-networks with or without normalization of the basis functions. Here, any distance measure between the codebook objects and the localization parameters (e.g. center, median) of the prototypical distributions of the exploration space used for XOM clustering can be used for the definition of the basis functions, e.g. the likelihood of the positioning of the codebook objects with regard to the prototypical distributions.
-   (b) Combination of XOM embedding with methods and devices for interpolation or approximation.

22. Visualization: Numerous methods and devices can be employed for the visualization of input data, structure hypotheses, and calculation results in XOM. The following should be specifically stressed here:

-   (a) the visualization of the codebook objects in the outcome or exploration space, or of their movement, in the sense of an outcome plot; refer to item 2
-   (b) the visualization of the training data distributions in the exploration space in the sense of an exploration plot; refer to item 2
-   (c) the visualization of the data objects of the exploration or outcome space in the ordering space in the sense of an ordering plot; refer to item 2
-   (d) the visualization of the mapping quality in a distance plot, or of quantities derived from it; refer to item 2
-   (e) color coding or other graphical marking of the local topology violation or other local criteria for the mapping quality according to item 2 in the exploration, outcome, ordering, or distance plot.

23. Mutual Connectivity Analysis: In the following, methods and devices are described which allow an innovative kind of data processing based on dissimilarity data. The underlying method will be denominated in the following "Mutual Connectivity Analysis" (MCA). XOM represents an important method for the data analysis in connection with MCA, see below.

First, an example of a typical technical procedure is presented for illustration. It should be stressed that this procedure is not restricted to the kinds of data and calculation methods mentioned in this example.

-   The starting point is a set of N time series Z={z₁, . . . , z_(N)}, e.g. in the form of a set of data vectors in R^M, M∈ℕ, whereby each data vector z_(n), n=1, . . . , N, represents one time series, and each element z_(nt) of the vector z_(n) represents the value of the time series at time t, with t∈{1, . . . , M}.

For each time series a "sliding window" of length 2p+1, p∈ℕ, p<M, is then defined, which groups together 2p+1 chronologically neighboring values of the time series, namely those at times t′ with t−p≦t′≦t+p, i.e. 2p+1 successive elements of the data vectors representing the time series, whereby p is chosen equal for all time series. (For the beginning and the end of the time series, heuristic conventions are made on how to define the sliding window there.) The portions cut from the time series z_(n) in this way, i.e. the vectors cut from the data vectors representing the time series, are in the following denoted as "windows" x_(n)(t).
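A minimal sketch of this window construction, assuming equidistantly sampled scalar time series stored as 1-D arrays; silently skipping the incomplete windows at both ends is just one possible choice for the boundary convention mentioned above:

```python
import numpy as np

def windows(z, p):
    """Return all complete windows x(t) = (z[t-p], ..., z[t+p]) of length 2p+1."""
    z = np.asarray(z, dtype=float)
    M = len(z)
    # one row per admissible center t; boundary positions are skipped here
    return np.array([z[t - p:t + p + 1] for t in range(p, M - p)])
```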

Now, two time series z_(r) and z_(s) are selected.

For all windows of these two time series a distance d(x_(r)(t), x_(s)(t)) is then determined, for instance by use of a supervised learning method. This can be done by determining, for each t, the prediction error with which x_(s)(t) can be predicted from x_(r)(t) after completion of the training of the supervised learning method. The supervised learning method can be trained, for example, with a subset of all pairs (x_(r)(t), x_(s)(t)). For this, the set of these pairs can be split into training, test, and/or validation data sets, as is the general convention for the application of supervised learning methods. Obviously, it can make sense, depending on the task, to determine d(x_(r)(t), x_(s)(t+τ)) with a suitable time offset τ instead of d(x_(r)(t), x_(s)(t)).

-   By using a suitable calculation method, a distance D_(rs)=D(x_(r), x_(s)) between the time series z_(r) and z_(s) is calculated from the computed d(x_(r)(t), x_(s)(t)) for the pairs (x_(r)(t), x_(s)(t)) selected as test data set. An obvious calculation method for this is, for example, to compute the average of the d(x_(r)(t), x_(s)(t)) over all t contained in the test data set, in the sense of a mean prediction error; a sketch of this computation follows this list. Note that, in general, D_(rs)≠D_(sr).
-   The procedure can then be repeated, for example, for all N² pairs of time series.
-   The resulting distance matrix of the distances between every two time series can now be further processed in any way, specifically by methods of data partitioning on dissimilarity data, e.g. pairwise clustering according to the technical standard, as in [21], [14], [10], or according to this patent application, e.g. according to item 5, or by methods for the classification on dissimilarity data, e.g. according to item 7d.
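The first of these steps can be illustrated by the following sketch: D_(rs) is computed as the mean prediction error with which the windows of z_(s) can be predicted from the windows of z_(r) on a held-out test set. An ordinary least-squares linear predictor stands in here for the supervised learning method; the split ratio and all names are illustrative assumptions, and any of the learning methods admitted in this application could be substituted.

```python
import numpy as np

def mutual_distance(Xr, Xs, train_frac=0.7, rng=None):
    """D_rs from paired windows Xr, Xs of shape (T, 2p+1); note D_rs != D_sr."""
    rng = rng or np.random.default_rng(0)
    idx = rng.permutation(len(Xr))
    split = int(train_frac * len(Xr))
    train, test = idx[:split], idx[split:]
    # fit a linear map (with bias) predicting x_s(t) from x_r(t) on training pairs
    A = np.hstack([Xr[train], np.ones((len(train), 1))])
    W, *_ = np.linalg.lstsq(A, Xs[train], rcond=None)
    # mean prediction error over the held-out test pairs
    A_test = np.hstack([Xr[test], np.ones((len(test), 1))])
    return float(np.mean(np.linalg.norm(A_test @ W - Xs[test], axis=1)))
```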

For the calculation of the mutual distances of time series, it is possible to achieve a considerable speed advantage by using calculation rules that split this distance calculation into two steps, whereby one step has to be performed only once for each time series and the other one for every pair of time series.

This can be illustrated by the example above: The prediction of a time series z_(s) from another time series z_(r) can be done, for example, by training of a radial basis functions-network (e.g. as in [46]): Here, first, the windows of the time series z_(r) are processed by vector quantization. From this, prototypical time series windows result, which can be called codebook vectors according to the introductory remarks about vector quantization in section 1.1. It is essential that the vector quantization has to be performed only once for each time series. The codebook vectors are then used for the supervised training of the output layer of a radial basis functions-network (refer, for instance, to [46]), where the windows of the time series z_(s) serve as target values for the supervised training. The training of the output layer of the radial basis functions-network must be repeated for every time series z_(s) to be predicted from the time series z_(r), i.e. a total of N² times if all pairs of time series are taken into account. Thus, the vector quantization has to be performed N times, the training of the output layer of the radial basis functions-network N² times, if all pairs of time series are taken into account. In general, the computational expense for the vector quantization is considerably higher than for the supervised training of the output layer of the radial basis functions-network, which, therefore, results in a considerable speed advantage for the entire procedure.
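A sketch of this two-step split follows, with plain k-means standing in for the vector quantization and a linear least-squares fit for the training of the output layer; k, sigma, and the iteration count are illustrative assumptions. Per time series, codebook() runs once; per ordered pair of time series, only rbf_features() and fit_output_layer() are needed.

```python
import numpy as np

def codebook(X, k=16, iters=20, rng=None):
    """Vector quantization of the windows X (T, d) into k codebook vectors."""
    rng = rng or np.random.default_rng(0)
    C = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):  # plain k-means; done once per time series
        assign = np.argmin(((X[:, None] - C[None]) ** 2).sum(-1), axis=1)
        for j in range(k):
            if np.any(assign == j):
                C[j] = X[assign == j].mean(axis=0)
    return C

def rbf_features(X, C, sigma=1.0):
    """Gaussian basis function activations of the windows X for codebook C."""
    d2 = ((X[:, None] - C[None]) ** 2).sum(-1)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def fit_output_layer(Phi, Y):
    """Cheap step, repeated once per ordered pair of time series."""
    W, *_ = np.linalg.lstsq(Phi, Y, rcond=None)
    return W
```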

In analogy, it is possible to use local models, e.g. as in [43], [29], e.g. local average models or local linear models, instead of radial basis functions-networks. The following procedure can then be used for the prediction of a time series z_(s) from another time series z_(r): First, determine the k nearest neighbors of each window of the time series z_(r) among the other windows of this time series. This step only has to be carried out once for each time series z_(r). In a second step, these k nearest neighbors of the time series windows are interpolated or approximated, for each time series to be predicted, according to the calculation rule of the respective local model, whereby the windows of the time series z_(s) are used as target values for the supervised training. The interpolation or approximation for the k nearest neighbors of the time series windows in the calculation rule of the local model must be repeated for every time series z_(s) to be predicted from the time series z_(r), i.e. N² times, if all pairs of time series are taken into account. The search for the k nearest neighbors of the time series windows must thus be performed N times, the interpolation or approximation of the k nearest neighbors of the time series windows N² times, if all pairs of time series are taken into account. In general, the computational expense is considerably higher for the search for the k nearest neighbors of the time series windows than for the interpolation or approximation according to the calculation rule of the local model, which, again, results in a considerable speed advantage for the entire process.
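Analogously, a sketch of the local average model variant: the nearest-neighbor search runs once per time series, whereas the per-pair step is only a cheap averaging. The value of k is an illustrative choice.

```python
import numpy as np

def knn_indices(Xr, k=5):
    """Indices of the k nearest neighbors of each window of z_r; once per series."""
    d2 = ((Xr[:, None] - Xr[None]) ** 2).sum(-1)
    np.fill_diagonal(d2, np.inf)  # a window is not its own neighbor
    return np.argsort(d2, axis=1)[:, :k]

def local_average_prediction(nn_idx, Xs):
    """Predict x_s(t) as the average of x_s over the neighbors of x_r(t)."""
    return Xs[nn_idx].mean(axis=1)  # cheap step, repeated per pair (r, s)
```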

It should be stressed that the concept of the MCA is not limited to the data types and calculation rules for the analysis of time series mentioned in this example. Rather, the example above motivates the following technical procedure.

First, some terms should be defined:

Data Objects are data without any limitation, e.g. sets, numbers, vectors, graphs, symbols, texts, images, signals, mathematical mappings and their representations, e.g. matrices, tensors, etc., as well as any combinations of data objects.

Sub Data Objects are data objects that do not contain the complete information of the data objects, i.e. the original data object, in general, cannot be fully calculated from the knowledge of a sub data object.

Distance Objects are data objects that characterize similarity relations or distances between data objects, according to any distance measure. Here, distance measures induced by metrics as well as, specifically, similarity relations or dissimilarities defined by arbitrary distance measures that are possibly not determined by a metric are included. Some typical distance measures on the basis of dissimilarities are mentioned, e.g., in [19]. Metric is here defined in the mathematical sense, refer e.g. to [5].

Sub Distance Objects are distance objects between sub data objects, specifically those of different data objects.

For reasons of clarity, it should be mentioned that the following correspondences could be chosen with regard to the example above: A data object corresponds to a time series. A sub data object corresponds to a window. A sub distance object corresponds to a distance between time series windows of different time series.

Technical procedure, MCA:

A set of data objects is given. Specifically, the set of data objects can also contain exact copies of the data objects.

-   Enter the set of data objects into the data processing method or device.
-   Divide the data objects into sub data objects. The division need be neither disjoint nor complete.
-   Calculate distance objects between sub data objects of the entered set of data objects. These distance objects are called sub distance objects.
-   Calculate, by use of the sub distance objects, new distance objects that represent the distances between the data objects of the entered set of data objects.

Deliver the distance objects computed from this.
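Composing the helpers sketched earlier (windows() and mutual_distance() from the time series example above), the whole MCA flow can be outlined as follows; for other data types, any division into sub data objects and any admissible sub-distance calculation could be substituted:

```python
import numpy as np

def mca_distance_matrix(series, p=3):
    """N x N matrix of mutual distances for equally long 1-D time series."""
    subs = [windows(z, p) for z in series]       # sub data objects per data object
    N = len(series)
    D = np.zeros((N, N))
    for r in range(N):
        for s in range(N):
            if r != s:                           # sub distances -> distance objects
                D[r, s] = mutual_distance(subs[r], subs[s])
    return D  # may be further processed, e.g. by pairwise clustering
```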

It is essential that the calculation of the distance objects between the sub data objects or data objects can be performed by means of any methods for interpolation, extrapolation, and/or approximation. In particular, among these methods are:

-   (i) Statistical Learning Methods of any kind, specifically those requiring supervised learning, particularly neural networks and Support Vector Machines, Bayes networks, Hidden Markov Models, Observable Operator Models (e.g. [23]). Among the neural networks, the following should be specifically mentioned: the Multilayer Perceptron in all variants described in the literature, specifically those trained by error back-propagation; radial basis functions-networks in all variants described in the literature, specifically also generalized radial basis functions-networks; ART networks; Local Linear Mappings (LLM) (refer, for example, to [36]) in all variants described in the literature; as well as other neural networks allowing supervised learning, such as topology-preserving mappings, self-organizing maps, and XOM.
-   (ii) Local Models of any kind: local average models (also with weighting), local linear models, local models with additional topological constraints (e.g. [43]), specifically adaptive local models with parameters depending on the respective learning success achieved (for a literature overview refer e.g. to [43], [30], [29]).
-   (iii) Methods of Inferential Statistics, specifically if test statistics or levels of significance of statistical tests are used as distance measures [39].
-   (iv) Special Calculation Methods, such as the Levenstein distance, Mutual Information, Kullback-Leibler Divergence, coherence measures employed in signal processing, specifically for biosignals, e.g. [42], [41], the LPC cepstral distance, distance measures that compare the power spectra of two signals, such as the Itakura-Saito distance (refer to [22]), the Mahalanobis distance, and distance measures regarding the phase synchronization of oscillators, e.g. [37].

One variant of this procedure should be specifically mentioned. For reasons of clarity, it should be noted that, with regard to the example above for the analysis of a set of time series, the following correspondences could be chosen: An auxiliary data object corresponds, for instance, to a codebook vector in the vector quantization of the time series windows of a time series in the prediction of time series by use of radial basis functions-networks. Alternatively, an auxiliary data object corresponds, for instance, to a set of k nearest neighbors of a time series window in the time series prediction by use of local models.

Variant of the technical process flow, MCA:

A set of data objects is given. Specifically, the set of data objects can also contain exact copies of the data objects.

-   Enter the set of data objects into the data processing method or device.
-   Divide the data objects into sub data objects. The division need be neither disjoint nor complete.
-   Calculate new data objects for the sub data objects of single data objects, the so-called auxiliary data objects.
-   Calculate, by use of the auxiliary data objects, distance objects between sub data objects of the entered set of data objects. These distance objects are called sub distance objects.
-   Calculate, by use of the sub distance objects, new distance objects that characterize the distances between the data objects of the entered set of data objects.
-   Deliver the distance objects computed from this.

Specifically, the sub data objects of only one single entered data object, and/or more than one sub data object, can be used for the calculation of an auxiliary data object.

In the following, the output distance objects can be analyzed. For this, methods and devices according to the technical standard and/or to the descriptions in this patent application are suitable. In this context, the following should be specifically mentioned: methods and devices for clustering and/or for supervised learning, in particular for pairwise clustering of dissimilarity data, e.g. [21], [14], [10]; methods and devices for XOM clustering according to item 5; methods and devices for supervised learning, e.g. classification on dissimilarity data, e.g. [15]; and methods and devices for supervised learning on dissimilarity data by XOM according to item 7d.

As application examples, the following should be mentioned: data processing, e.g. clustering, of financial time series, such as stock prices; processing of data, e.g. time series, from the fields of economy, finance, medicine, natural sciences, and/or technology, specifically of ordered data objects, e.g. time series of laboratory values or other measurements of biomedical or meteorological research methods, e.g. biomedical images, gene expression profiles, gene or amino acid sequences.

For the time series example above it is clear that the definitions of data types and calculation methods made there do not imply any restrictions with regard to the general technical procedures. Specifically, any data objects, e.g. ordered data objects such as images or gene sequences, can be used instead of time series. In the analysis of time series, the values of the time series do not have to be measured equidistantly; it is not necessary to use sliding windows or supervised learning methods for the analysis of the output data, etc.

1.-17. (canceled)
18. A method of processing data for the mapping of input data to output data, the method to be executed on a data processing device and comprising the following steps: (a) providing data objects to be processed as input data; (b) processing provided data objects by using a topology-preserving mapping, by: (i) ordering neurons in ordering space, according to a given pattern; (ii) assigning codebook objects in outcome space to the neurons; (iii) processing codebook objects according to the calculation rule of a topology-preserving mapping, by use of data objects of the exploration space; and (iv) outputting the processed codebook objects as output data; said method characterized by comprising at least one of the following steps: (c) determining the order of neurons in the ordering space by using at least a part of the provided data objects, and (d) providing data objects, which are required for the data processing, which are independent of the input data to be processed and which are used as data objects of the exploration space.
19. The method of claim 18, wherein the data objects to be processed are distance objects.
20. The method of claim 18, wherein data objects in the ordering space are ordered irregularly.
21. The method of claim 18, wherein data objects of at least one of the ordering space, exploration space, and outcome space are used which comply with at least one of the following conditions: (A) they satisfy a non-Euclidean geometry; (B) they are distance objects to data objects of a local neighborhood of data objects; (C) they represent data distributions with a fractal dimension; (D) they represent data distributions of non-orientable surfaces in the sense of differential geometry; (E) they are added, omitted or modified during the training process or a series of training processes of the topology-preserving mapping, in particular for distance objects in the ordering space; (F) they are influenced by additional constraints; (G) they are saved or processed in local units; and (H) they are added, omitted or modified after completion of the training of the topology-preserving mapping.
22. The method of claim 18, wherein at least one of the calculation rule of the topology-preserving mapping and at least one parameter of this calculation rule: is chosen depending on the respective processed data object of at least one of the ordering space, exploration space and outcome space; is modified during the training process or over several training processes of the topology-preserving mapping, in particular depending on the respective processed data object of at least one of the ordering space, exploration space, and outcome space; and is influenced by additional constraints.
23. A data processing device for carrying out the method of claim 18.
24. A computer program product, which is stored in a memory medium and contains software code segments, configured for carrying out the method of claim 18 if the computer program product is run on a data processing device.
25. A method of processing data for the mapping of data objects to be processed to distance objects, the method to be executed on a data processing device and comprising the following steps: (a) providing data objects to be processed; (b) calculating distances between the data objects to be processed as distance objects; and (c) outputting these distance objects as output data; said method characterized by the step of: (d) calculating the distances by use of at least one of statistical learning methods, local models, methods of inferential statistics, and one of the following specific computation methods: (A) Levenstein Measure; (B) Mutual Information; (C) Kullback-Leibler Divergence; (D) coherence measures employed in signal processing, in particular for biosignals; (E) LPC cepstral distance; (F) calculation methods that relate the power spectra of two signals, such as the Itakura-Saito Distance; (G) the Mahalanobis-Distance; and (H) calculation methods relating to the phase-synchronization of oscillators.
26. A data processing device for carrying out the method of claim 25.
27. A computer program product, which is stored in a memory medium and contains software code segments, configured for carrying out the method of claim 25 if the computer program product is run on a data processing device.
28. A method of processing data for the determination of the cluster validity, the method to be executed on a data processing device and comprising the following steps: (a) providing data objects as input data; (b) providing distance objects between these data objects; (c) providing an assignment of the data objects to be processed to groups by: (i) processing provided data objects by using a topology-preserving mapping, by: (1) ordering neurons in ordering space, according to a given pattern; (2) assigning codebook objects in outcome space to the neurons; (3) processing codebook objects according to the calculation rule of a topology-preserving mapping, by use of data objects of the exploration space; (4) outputting the processed codebook objects as output data; (ii) at least one of the following substeps (1) and (2): (1) determining the order of neurons in the ordering space by using at least a part of the provided data objects; (2) providing said data objects that are independent of the input data to be processed and which are used as data objects of the exploration space; and (d) outputting a measure of the quality of this assignment as output data, said method characterized by the step of: (e) calculating the measure of the quality of the assignment by employing at least a part of the provided distance objects.
29. The method of claim 28, wherein step (e) comprises the steps of: (f) providing data objects to be processed as input data; (g) processing provided data objects by using a topology-preserving mapping; and (h) applying a cost function of a method for the clustering of dissimilarity data, wherein the measure of the quality of the assignment is calculated by using at least one of the set of substeps (h)(i) and (h)(ii) and the set of substeps (h)(iii) to (h)(vi), and a cost function of a method for the clustering of dissimilarity data: (i) processing provided dissimilarity data objects by using a topology-preserving mapping, by: (1) ordering neurons in ordering space, according to a given pattern; (2) assigning codebook objects in outcome space to the neurons; (3) processing codebook objects according to the calculation rule of a topology-preserving mapping, by use of data objects of the exploration space; (4) outputting the processed codebook objects as output data; (ii) at least one of the following substeps (1) and (2): (1) determining the order of neurons in the ordering space by using at least a part of the provided dissimilarity data objects; and (2) providing said dissimilarity data objects that are independent of the input data to be processed and which are used as data objects of the exploration space; (iii) providing dissimilarity data objects to be processed; (iv) calculating distances between the dissimilarity data objects to be processed as distance objects; (v) outputting these distance objects as output data; (vi) calculating the distances by use of at least one of statistical learning methods, local models, methods of inferential statistics, and one of the following specific computation methods: (A) Levenstein Measure; (B) Mutual Information; (C) Kullback-Leibler Divergence; (D) coherence measures employed in signal processing, in particular for biosignals; (E) LPC cepstral distance; (F) calculation methods that relate the power spectra of two signals, such as the Itakura-Saito Distance; (G) the Mahalanobis-Distance; and (H) calculation methods relating to the phase-synchronization of oscillators.
30. The method of claim 28, which is carried out repeatedly, wherein the output data of a previous run of the procedure are entered as input data of a subsequent run of the procedure.
31. The method of claim 28, comprising the step of: (f) determining the quality of the output data and outputting this determined quality.
32. The method of claim 31, wherein the quality is determined by at least one of: (A) calculating measures for topology-preservation or distribution-preservation; (B) calculating distortion measures; (C) relating the distance of data objects in the ordering space to the distances of corresponding data objects in at least one of the outcome space and the exploration space, in particular by plotting these data objects in a distance plot; (D) graphically displaying data objects of at least one of the exploration space, the outcome space and the ordering space, in particular by applying these data objects to at least one of an exploration, outcome and ordering plot; (E) graphically displaying data objects calculated from data objects of at least one of the exploration space, outcome space and ordering space, in particular by plotting these data objects in at least one of an exploration plot, outcome plot and ordering plot; (F) calculating and outputting the mapping error for at least one of an interpolation, extrapolation, approximation and supervised learning, in particular by forward and backward projection; and (G) sequential processing of data objects.
33. The method of claim 31, wherein the determined quality is used for at least one of: (A) adding, omitting or modifying data objects of at least one of the exploration space, the outcome space and the ordering space of the topology-preserving mapping; and (B) modifying at least one of the calculation rule of the topology-preserving mapping and its parameters, in particular depending on data objects of at least one of the exploration, outcome and ordering space.
34. The method of claim 28 which is used for at least one of the following: (A) for dimension determination, in particular for the determination of fractal dimensions; (B) for non-linear embedding, in particular of non-metric data and/or dissimilarity data; (C) for clustering, in particular of non-metric data and/or dissimilarity data; (D) for determining the cluster validity, in particular of dissimilarity data and/or non-metric data; (E) for supervised learning, in particular on non-metric data or dissimilarity data; (F) for the registration of data sets; (G) for active learning; (H) for sorting; (I) for the optimization, in particular for non-metric data or dissimilarity data; (J) for finding solutions of Traveling Salesman Problems and equivalent problems, in particular non-metric Traveling Salesman Problems; (K) for the calculation of hyper-manifolds; (L) for interpolation, extrapolation, or approximation; (M) for relevance learning; (N) for the visualization of graphs; (O) for graph layout; and (P) for the construction of self-developing, self-repairing, and/or self-reproducing systems, in particular of technical systems.
35. The method of claim 34 which is used for at least one of the following: (Q) dimension determination and non-linear embedding; (R) non-linear embedding and calculation of hyper-manifolds; (S) clustering and determination of cluster validity; and (T) non-linear embedding and clustering.
36. The method of claim 34 which is used for at least one of the following: (Q) the molecular dynamics simulation, in particular where constraints, in particular rigid spatial relations, in the molecule or its surroundings, are modeled as distances of the neurons in the ordering space; (R) the problem solving in the field of robotics, in particular when constraints, notably rigid spatial relations, in the robot or its surroundings, are modeled as distances of the neurons in the ordering space; (S) data in the fields of economics, finances, medicine, humanities, natural sciences, or technology, in particular in the fields of circuit design, bio-informatics, robotics, meteorology, image processing; (T) in the field of data-mining, in particular text-mining; (U) in the field of security technology, specifically flight or access security; (V) in the field of logistics, in particular traffic control and maintenance systems; and (W) in the fields of communication technology or cryptology.
37. A data processing device for carrying out the method of claim 28.
38. A computer program product, which is stored in a memory medium and contains software code segments, configured for carrying out the method of claim 28 if the computer program product is run on a data processing device.